1.


INTRODUCTION 


The  initial  involvement  of  Meteorology  International  Incorporated 
with  analogue  prediction  techniques  developed  out  of  years  of  association 
with  the  FNWC  Optimum  Track  Ship  Routing  (OTSR)  System.  During  this 
association  it  became  apparent  that  the  quality  and  timeliness  of  the 
product,  based  in  part  on  analogue  techniques,  left  much  to  be  desired 
from  the  point  of  view  of  operational  utility.  Existing  analogue  selection 
and  compilation  techniques  were  crude  and,  in  terms  of  computer  processing 
time,  were  cumbersome  and  costly. 

Confronted with this situation, MII became directly involved in
analogue research and development as from January 1973 with the objective
of  devising  and  implementing  an  analogue  selection  scheme  which  was  both 
rapid  and  based  on  comprehensive  and  realistic  selection  criteria.  As  the 
system  was  developed  and  its  early  capabilities  explored,  it  became 
apparent  that  analogue  selection  based  on  the  total  hemisphere  could  only 
be  useful  in  the  broadest  terms  because  of  the  great  variability  in  synoptic 
patterns  occurring  simultaneously  over  the  hemisphere.  (The  data  base 
required  to  provide  a  reasonable  number  of  good  analogues  if  trying  to 
match  the  hemisphere  as  a  whole  is  far  greater  than  that  available.)  Because 
of  this  variability,  even  the  top-scoring  hemispheric  analogues  had  little 
relevance to any operationally-significant analogue forecast for a local area
such  as  the  Mediterranean  Sea.  What  was  required,  of  course,  was  the 
ability  to  focus  on  any  pre-selected  region,  taking  into  account,  when 
selecting  analogues,  only  those  essential  features  of  the  space  and  time 
scales  of  the  atmospheric  disturbances  likely  to  affect  the  region  during 
the  objective  period  of  the  forecast. 

The  first  efforts  aimed  at  regionalizing  the  analogue  selection 
system were carried out on behalf of NEPRF for the Mediterranean region;
subsequently, further work was performed for FNWC directed toward the
eventual goal of a multi-regional capability.

These  first  efforts  laid  only  the  foundations  of  the  Regionalized 
Rapid  Analogue  Selection  System  (RASS);  further  development  work  was 
required. This requirement was recognized and in June 1976, under NEPRF
sponsorship, MII was awarded Contract No. N00228-76-C-3189 to continue
with  the  development  of  RASS  with  specific  emphasis  on  its  application  to 
the  Mediterranean.  Under  the  terms  of  this  Contract,  the  work  was  to  be 
carried  out  by  the  performance  of  Tasks  1  and  2,  the  objectives  of  these 
Tasks  being  detailed  in  Section  2.  Task  1  was  finished  in  December  1976, 
an  Interim  Report  being  delivered  to  NEPRF  on  completion.  Task  2  has  now 
been  completed,  and  this  Final  Report  presents  the  methods  developed  and 
the  results  obtained  in  fulfillment  of  both  Task  1  and  Task  2. 


2. 


OBJECTIVES 


The  overall  purpose  of  this  project  was  to  improve  the  regionalized 
rapid analogue selection capabilities with emphasis on the Mediterranean
area.  The  specific  objectives  of  the  work  were  essentially  as  follows: 


2.1  Task  1 


a.  Re-examine  and  modify  the  components  presently  used  in  the 
regionalized  rapid  analogue  selection  scheme. 

b. Design an optimum storage configuration and construct a new
history data base to incorporate necessary data resolution
while  significantly  reducing  total  data  tape  handling  in  the 
current  data  base  of  28  years. 

c.  Compute  a  climatology  of  each  component  field  in  addition  to 
the  history  base. 

d.  Develop  techniques  which  provide  effective  measurement  of 
both  large-  and  small-scale  characteristics  in  both  space  and 
time  of  the  component  fields. 

e.  Design  selection  techniques  so  as  to  be  pertinent  for  analogues 
covering  the  Mediterranean  region,  but  with  capabilities  to  be 
modified  for  any  predetermined  region  in  the  Northern  Hemisphere. 

f. Produce an operational program to be run on the Fleet Numerical
Weather  Central  (FNWC)  CDC-6500  computer  system. 

g.  Design  methods  for  both  tuning  the  regionalized  rapid  analogue 
scheme  and  for  verification. 

h.  Write  an  interim  report  for  the  internal  use  of  the  Naval 
Environmental  Prediction  Research  Facility. 


2.2  Task  2 


a. Expand upon the work initiated under Task 1 with increased
tuning  of  the  program. 

b.  Thoroughly  demonstrate  and  evaluate  at  least  two  historical 
periods  using  data  furnished  by  NEPRF. 

c.  Design  a  continuing  program  for  verification  statistics  of  the 
regionalized  rapid  analogue  scheme. 

d. Produce a final report to conform with MIL-STD-847A, Format
Requirements for Scientific and Technical Reports Prepared by
or for the Department of Defense, 31 Jan 1973. The results
of  the  demonstration  and  evaluation  of  the  historical  periods 
should  be  presented  as  case  studies. 




3. 


THE  ANALOGUE  APPROACH  TO  ENVIRONMENTAL  FORECASTING 


3.1  Terminology Used in This Report

A  METEOROLOGICAL  SITUATION,  defined  as  occurring  at  a  fixed  point  in 
time,  may  be  represented  and  comprehended  by  an  assemblage  of 
SPECIFYING  PARAMETERS. 

A SCENARIO is a METEOROLOGICAL EPISODE or SEQUENCE, defined as a
(normally brief) time-connected series of situations. The specifying
parameters involve time.

A  METEOROLOGICAL  EVENT  may  be  either  a  situation  or  a  scenario,  as 
defined  above. 

An  ANALOGUE  is  a  meteorological  event  selected  from  historical  records 
as  being  acceptably  similar  (according  to  pre-established  criteria 
involving  the  specifying  parameters)  to  another  event. 

The  BASEDAY  is  the  meteorological  event  for  which  analogues  are  to  be 
selected.  For  forecasting,  either  the  current  situation  or  the 
current  scenario  would  be  used.  For  hindcasting  a  baseday  event 
would  be  chosen  from  historical  records. 

The  ANALOGUE  CANDIDATE  is  the  particular  event  being  compared  with  the 
baseday  to  assess  its  suitability  for  selection  as  an  acceptable 
analogue. 

MATCHING  is  the  process  of  comparing  the  chosen  baseday  with  all  analogue 
candidates  in  order  to  select  analogues.  Matching  is  performed  by 
comparison  of  corresponding  specifying  parameters. 

The  ANALOGUE  SCORE  is  the  number  assigned  to  an  analogue  candidate  as 
a  result  of  the  matching  process,  this  number  being  a  measure  of 
the  overall  degree  of  matching  or  similarity. 


PERSISTENCE  FORECAST.  A  forecast  method  based  on  the  assumption  that 
meteorological  conditions  during  the  forecast  period  remain 
unchanged  from  those  prevailing  at  the  beginning  of  the  forecast 
period.  A  persistence  forecast  may  be  taken  as  demonstrating 
zero  skill,  thus  providing  a  basis  for  determining  the  effectiveness 
of  other  forecast  techniques. 

CLIMATOLOGICAL  FORECAST.  A  forecast  regarding  the  future  value  of  a 
meteorological  parameter,  couched  in  terms  which  relate  stated 
ranges  of  that  parameter  to  their  percentage  probability  of  occurrence 
during  the  forecast  period,  based  entirely  on  statistics. 

DETERMINISTIC  FORECAST.  A  forecast  which  gives  only  what  is  considered 
to  be  the  most  probable  future  value  (or  narrow  range  of  values)  of 
a  meteorological  parameter.  In  general  no  additional  information 
is  provided  by  which  to  assess  the  actual  probability  associated  with 
the  forecast,  this  assessment  being  left  to  the  user — a  process 
requiring  considerable  experience  on  the  part  of  the  user.  A 
deterministic  forecast  is  therefore  an  incomplete  statement  of 
available  information. 

PROBABILISTIC  FORECAST.  A  forecast  expressed  in  terms  which  distribute 
the  full  probability  (100%)  over  the  entire  range  of  possible  future 
values  of  a  particular  meteorological  parameter.  A  probabilistic 
forecast  is  therefore  a  complete  statement  of  available  information. 

3.2  Discussion  and  Outline  of  the  Analogue  Approach 
3.2.1  General  Approach 

The  analogue  approach  to  meteorological  forecasting  is  based  on  an 
ability  to  recognize  significant  degrees  of  similarity  between  events  which 
have  occurred  in  recorded  meteorological  history  and  the  current  event. 


An  historical  event  recognized  as  being  an  acceptably  close  match  to  the 
current  event  (or  to  another  chosen  historical  event)  is  termed  an  "analogue". 
The underlying premise is that, given sufficient and relevant similarity, an
analogy  may  be  drawn  between  what  did  follow  from  the  selected  historical 
events,  and  what  will  follow  from  the  current  event. 

Assuming  this  premise  is  accepted,  it  follows  that  any  effective 
analogue  forecasting  system  must  include  the  following  basic  components: 

a.  A  methodology  for  interpreting  any  meteorological  event  in 
terms  of  relevant  specifying  parameters. 

b.  A  data  base,  expressed  in  terms  of  the  specifying  parameters, 
which  is  sufficiently  large  to  encompass  the  range  of  significant 
variabilities  which  have  occurred  in  meteorological  history  and 
which  may  possibly  occur  (within  reason)  during  the  forecast 
period.

c.  A  system  for  comparing  the  selected  meteorological  event  with 
all  others  in  the  data  base  in  order  to  select  analogues.  In 
practice, of course, any workable scheme will find degrees of
similarity ranging from very good matches (hopefully) to very
poor  matches.  Thus  the  matching  technique  must  incorporate 
a  scoring  system,  allowing  the  analogue  candidates  to  be 
ranked  in  order  from  the  best  fit  to  the  worst  fit. 

d.  A  method  of  compiling  a  forecast  from  the  selected  analogues 
and  their  ensuing  scenarios. 

Note: As far as is known, no attempt has been made previously
to select analogues based on a baseday scenario, only
on a baseday situation. The ability to match scenarios,
described in this Report, is a development unique to MII.

In essence the analogue approach is one of compiling a day-by-day
"selective  climatology" — the  selection  process  eliminates  those  developments 
unlikely  to  ensue  (based  on  meteorological  history)  from  the  current  scenario 
or  situation,  choosing  only  those  developments  which,  in  the  past,  have 
evolved  from  events  similar  to  those  currently  taking  place.  Clearly  any 
skill  used  in  selecting  the  appropriate  developments  and  from  them  compiling 
a  day-by-day  "climatology",  must  provide  a  more  skillful  probabilistic 
forecast than using the complete climatology which incorporates all scenarios,
including  those  recognizable  as  being  unlikely  to  evolve  from  current  events. 

If  an  analogue  selection  system  fails  to  demonstrate  this  increase  in  skill 
then  it  follows  that  the  design  of  the  system  is  such  that  no  skill  is  being 
used  in  the  overall  selection  process. 
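
The "selective climatology" idea can be illustrated with a minimal sketch. It assumes that each analogue candidate carries a score and the value of some parameter observed a fixed number of days after the candidate date. The function name, the bin edges and the sample numbers are all hypothetical and are not taken from RASS.

```python
from collections import Counter

def selective_climatology(candidates, bin_edges, top_n):
    """Return the probability of each bin using only the top_n analogues.

    candidates: list of (score, outcome) pairs; a higher score is a better match.
    bin_edges:  ascending edges; bin k holds outcomes with edges[k] <= x < edges[k+1].
    """
    best = sorted(candidates, key=lambda c: c[0], reverse=True)[:top_n]
    counts = Counter()
    for _, outcome in best:
        for k in range(len(bin_edges) - 1):
            if bin_edges[k] <= outcome < bin_edges[k + 1]:
                counts[k] += 1
                break
    total = sum(counts.values())
    return [counts[k] / total if total else 0.0 for k in range(len(bin_edges) - 1)]

# Hypothetical wind speeds (knots) observed 5 days after each candidate date.
history = [(0.91, 12), (0.88, 18), (0.40, 35), (0.85, 14), (0.35, 40), (0.80, 22)]
edges = [0, 15, 25, 50]

# Complete climatology: every candidate contributes.
full = selective_climatology(history, edges, top_n=len(history))
# Selective climatology: only the four best-scoring analogues contribute.
selective = selective_climatology(history, edges, top_n=4)
```

In this made-up sample the complete climatology spreads the probability evenly over the three bins, whereas the selective version assigns none to the strongest-wind bin, because the poorly scoring candidates that produced those outcomes have been excluded.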

As envisaged by MII, a major use of a successful analogue forecasting
scheme  lies  in  the  compilation  of  extended  range  forecasts  —  say  from  3  to  10 
days.  Out  to  3  days  the  various  numerical  analysis  and  forecast  models, 
aided  by  the  subjective  skills  of  the  experienced  forecaster,  demonstrate 
considerable  skill  over  persistence  or  climatology,  this  skill  decreasing 
rapidly  with  lapsed  time;  after  about  3  days  any  deterministic  skill  can  only 
be  expressed  in  gross  terms.  Although  not  displaying  the  initial  skill 
available  from  numerical  models,  the  probabilistic  skill  provided  by  an 
effective  analogue  system  should  degrade  more  slowly  with  time,  providing 
more  meaningful  forecasts  than  numerical  models  after  about  3  days.  With 
current  technology  and  understanding  it  seems  unlikely  that  any  operationally 
significant  forecasting  skill,  superior  to  say  a  monthly  climatology,  can 
exist  much  beyond  10  days,  although  it  may  be  possible  to  provide  "trend" 
forecasts  for  longer  periods. 

Various  analogue  forecasting  methodologies  have  been  designed 
using  the  four  basic  components  outlined  above.  Meaningful  comparison 
of  the  effectiveness  of  these  systems  is  made  difficult,  if  not  impossible, 
by  the  fact  that  the  objective  of  each  system  usually  differs  from  that  of 
other systems. However, none has demonstrated sufficient skill to warrant
its sustained use in any operational context without further development.

3.2.2  The MII Approach

3.2.2.1  Regional Focus

Most weather elements of operational significance (e.g., winds,
waves,  clouds,  precipitation,  fog,  etc.)  are  the  result  of  synoptic-scale 
disturbances  in  space  and  time.  On  this  scale  the  range  of  variabilities 
encountered  on  a  hemispheric  basis  is  so  great  that  the  available  data  base 
of  meteorological  history  (30  years)  is  insufficient  to  provide  analogues 
unless  the  selection  criteria  are  made  so  coarse  that  synoptic-scale 
disturbances  play  little  part  in  deciding  analogue  selection.  As  mentioned 
in Section 1, the alternative approach adopted by MII is to focus on a
region,  such  as  that  determining  the  meteorological  events  affecting  the 
Mediterranean  Sea,  making  no  attempt  to  match  irrelevant  external  events. 

(It  will  be  appreciated,  of  course,  that  to  produce  analogue  forecasts  for 
the  Mediterranean  Sea,  a  region  considerably  larger  than  the  Mediterranean 
itself  must  be  considered.) 

3.2.2.2  The Available Historical Data Base

The MII system for analogue forecasting has been designed to
exploit,  on  a  regional  focus  basis,  the  information  and  resolution  contained 
in  the  available  history  of  synoptic  fields.  The  available  archived  records 
consist  of  six  component  fields  for  the  whole  of  the  Northern  Hemisphere  for 
each date-time group. These are the three component range-of-scale* (SV,
SL and SD) fields for the 500-mb and 1000-mb height fields. An additional
three thickness fields, one for each scale component, are produced from
the 500-mb and 1000-mb isobaric fields as differences. Each of these nine
fields is expressed by a 63x63 array of grid-point values oriented as shown
in Fig. 1. The 30 years of available history, with intervals of once-daily
and twice-daily coverage, is summarized as follows:

    JAN 1946 - MAR 1955    once daily
    APR 1955 - MAR 1960    twice daily
    APR 1960 - DEC 1964    once daily
    JAN 1965 - DEC 1975    twice daily

*See Appendix A.


As  discussed  in  Section  8.1.2,  a  more  extensive  data  base  is  required 
to  take  full  advantage  of  RASS — in  particular,  more  frequent  analyses  are 
required  to  capture  the  small-scale  (SD)  variabilities  of  the  atmosphere. 

The gridded fields of this data base are not, of course, in the form
required by RASS; one of the tasks (Task 1b) of this Project was to construct
a  new  history  data  base  for  use  by  RASS.  (See  Section  5.) 
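
The derivation of the three thickness fields from the six archived fields can be sketched as follows. The sketch assumes each field is held as a list of rows of grid-point values; the tiny 2x2 grids and their numbers are hypothetical stand-ins for the 63x63 arrays, and the field names follow the labels used in Section 4.4.

```python
def thickness(field_500, field_1000):
    """Grid-point difference: 500-mb value minus 1000-mb value."""
    return [[hi - lo for hi, lo in zip(row_500, row_1000)]
            for row_500, row_1000 in zip(field_500, field_1000)]

# Hypothetical 2x2 excerpts of the six archived component fields (cm).
archived = {
    "SV500": [[550000, 551000], [552000, 553000]],
    "SV1000": [[10000, 11000], [12000, 12500]],
    "SL500": [[300, -200], [100, 0]],
    "SL1000": [[100, -50], [40, -10]],
    "SD500": [[30, -20], [10, 5]],
    "SD1000": [[10, -5], [0, 5]],
}

# Add one thickness field per range-of-scale, giving nine fields in all.
fields = dict(archived)
for scale in ("SV", "SL", "SD"):
    fields[scale + "5-10"] = thickness(archived[scale + "500"],
                                       archived[scale + "1000"])
```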

3.2.2.3  The Quick Screen

Any synoptic situation is represented by the appropriate set of 9
gridded fields discussed above. The fundamental component of any analogue
selection scheme lies in the techniques used for representing the baseday
and  analogue  candidate,  thus  allowing  comparisons  to  be  made  and  scored, 
and  analogues  selected.  Various  approaches  can  be  used,  all  of  which 
attempt  to  capture  the  essential  pattern  characteristics  of  the  historical 
fields.  Any  approach  should  take  into  account  the  variations  in  resolution 
required  by  the  different  scales  of  atmospheric  disturbance  (i.e.,  SV,  SL 
and SD). Also, the techniques used must be rapid enough to scan through
the total history in an acceptable time without sacrificing (in the interests
of speed) any of the necessary detail required for effective analogue
selection.

Fig. 1  The Standard 63x63 Grid Array of Northern-Hemisphere Coverage,
Polar-Stereographic Projection. Note that the grid-point
coordinates are numbered 0 through 62.

In an attempt to speed up the selection process, an earlier version
of the Rapid Analogue Selection System incorporated a preliminary "Quick
Screen" process for producing a much-reduced list of potential analogue
candidates. Having passed the Quick Screen criteria, these potential
analogues were then assigned a final score using a functional measure
which determined the proportion of baseday variance explained by each
analogue candidate (Fig. 2a). The Quick Screen technique was based on
a special bit-coding of component fields, designed so that the count of
matching bits (baseday compared with analogue candidate) gave a measure
of pattern similarity.

The  Quick  Screen  was  found  to  be  so  fast  and  effective  at  giving 
a  preview  of  the  final  scores  that  it  was  realized  it  could  be  expanded  in 
comprehensiveness to do the entire job of analogue selection. Quick
Screen  provides  absolute  measures  of  pattern  similarity  rather  than  the 
relative  measures  afforded  by  correlation  coefficients  or  our  functional 
measures.  It  can  also  give  the  regional  distribution  of  pattern  similarities 
for  any  component  characteristics  and  degrees  of  resolution.  The  flexibility 
of  the  design  readily  allows  analogue  selections  to  be  made  on  a  regional- 
focus  basis,  utilizing  data  base  subsets  from  the  full  hemispheric  coverage. 

Essentially,  this  particular  component  of  the  overall  Rapid  Analogue 
Selection  System  consists  of  an  expanded,  comprehensive,  and  very 
flexible  Quick  Screen  process.  If  presented  with  current  or  other  weather 
patterns  (including  scenarios),  it  can  scan  rapidly  the  data  base  and 
determine  whatever  similar  weather  patterns  may  have  occurred  in  a  history 
going  back  30  years.  The  search,  for  an  extensive  region  such  as  the 
Greater  Mediterranean,  can  generally  be  accomplished  in  the  order  of  three 
minutes  of  CDC-6500  CPU  time;  the  complete  history  for  the  regionally 
focused  subset  can  be  accommodated  on  one  large  reel  of  magnetic  tape. 

To satisfy Task 1f of this Project, RASS has been designed, developed
and  optimized  for  operational  use  on  the  FNWC  computer  system.  Program 
resource  requirements  are  well  within  the  constraints  of  operational 
specifications. 

4.


THE  BASIC  BIT  CODE  FOR  REPRESENTING  SYNOPTIC  PATTERNS 


4.1  Modular Design

The  Quick-Screen  bit  code,  applicable  to  gridded  fields,  is  a  scheme 
for  coding  synoptic  patterns.  The  coded  bit  strings  that  are  formed  represent 
stratified  grid-point  values,  and  the  differences  between  grid-point  values, 
in  regular  spacings  and  orderings  of  repetition  over  the  grid.  The  primary 
purpose  of  the  code  is  to  allow  easy  and  rapid  comparison  of  one  field  of 
patterns  with  another,  measuring  the  degree  of  similarity  between  these 
two  patterns  in  ranges  of  scale,  in  subregion  by  subregion,  and  in  coarse, 
medium  and  fine  degrees  of  resolution.  The  whole  Northern  Hemisphere 
can  be  covered  at  the  full  resolution  of  the  code;  however  any  subset  may 
be  extracted  to  correspond  to  a  specified  regional  focus. 

The  bit  code  is  formulated  in  terms  of  a  modularization  of  the  gridded 
fields,  a  module  consisting  of  a  4x4  array  of  grid-point  values.  The  spacing 
of  the  grid  points  used  to  form  a  module  differs  according  to  the  range-of- 
scale  inherent  in  synoptic  patterns.  Thus,  in  the  Disturbance  (SD) 
range-of-scale  the  full  density  of  the  63x63  grid  array  is  used;  in  the 
Long-wave  (SL)  range-of-scale  a  double-spaced  subset  is  used;  and  in  the 
Vortex  (SV)  range-of-scale  a  triple-spaced  subset  is  used.  These  arrays 
are  shown  in  Figs.  3,  4  and  5,  respectively.  The  numbering  of  the  modules 
extends  the  modular  concept  to  arrays  of  8x8. 
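
The relationship between range-of-scale and grid spacing can be sketched as follows. The function and its origin arguments are hypothetical; the actual placement and numbering of the modules are those of Figs. 3 to 5, which this sketch does not attempt to reproduce.

```python
SPACING = {"SD": 1, "SL": 2, "SV": 3}  # grid-point stride per range-of-scale

def module_values(grid, scale, row0, col0):
    """Return the 4x4 array of grid-point values for one module.

    grid:       63x63 list of rows (coordinates 0 through 62).
    row0, col0: grid coordinates of the module's first point (assumed layout).
    """
    step = SPACING[scale]
    last = 3 * step  # offset of the fourth point from the first
    if row0 + last > 62 or col0 + last > 62:
        raise ValueError("module extends beyond the 63x63 grid")
    return [[grid[row0 + i * step][col0 + j * step] for j in range(4)]
            for i in range(4)]

# Synthetic grid whose value encodes its own coordinates, for checking.
grid = [[100 * r + c for c in range(63)] for r in range(63)]
sd = module_values(grid, "SD", 0, 0)  # full density
sv = module_values(grid, "SV", 0, 0)  # every third grid point
```

With this layout an SD module spans 4 consecutive grid points in each direction, while an SV module of the same 4x4 size spans 10, which is how a fixed-size module captures progressively larger pattern features.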

In  order  to  effect  greater  discrimination  in  the  coding  of  the  vortex 
(SV)  range-of-scale  field,  the  coding  is  applied  to  the  anomaly  of  this  field 
from a long-term (annual) mean field, $SV - \overline{SV}$. The north-south gradient
of  the  vortex  anomaly  reverses  between  summer  and  winter,  giving  a  strong 
seasonal  discriminator.  Other  characteristics  associated  with  eccentricities 
of  the  vortex  are  also  accentuated. 


Fig.  3  Resolution  and  Coverage  for  the  Disturbance  Scale  of  Pattern 
Features: The SD Component Range-of-Scale. The 4x4
modules of the grid array are numbered for identification and
ordering.  The  density  of  grid  points  used  is  illustrated  in 
grid-array  subset  number  1. 

Fig. 5  Resolution and Coverage for the Planetary Vortex Scale of Pattern
Features: The SV Component Range-of-Scale. The 4x4 modules
of the grid array are numbered for identification and ordering.
The density of grid points used is illustrated in grid-array subset
number 1.

4.2  The Specifying Parameters for Each 4x4 Module


Associated  with  each  4x4  module  of  the  total  array  there  are  seventeen 
parameters  which  measure  the  pattern  characteristics  (i.e.,  value  and  shape) 
of  the  height  and  thickness  contours  affecting  that  module  for  any  given 
synoptic  situation.  This  set  of  parameters  has  been  designed  to  encompass 
the  various  scales  of  atmospheric  disturbance  and  contour  orientations  that 
could  occur  in  any  meteorological  situation.  These  seventeen  parameters 
are  shown  in  Fig.  6;  note  that  each  grid-point  value  of  the  4x4  module 
enters  into  two  of  the  seventeen  parameters. 

4.3  The Bit-Code for the Specifying Parameters

For  any  given  meteorological  situation,  each  of  these  seventeen 
parameters  will  have  a  numerical  value  of  height  or  thickness  (parameters  A 
or B), or height or thickness difference (parameters C through Q). A bit-code
is then assigned to each parameter, this bit-code defining the range interval
into which the actual measured value of the parameter falls. (Note that this
bit-code is not a binary code.) These range intervals are defined in terms
of range levels which, in turn, are expressed in terms of a mean standard
deviation, σ*. For example, Fig. 7 shows the range intervals and
range levels for parameter A, together with their specifying bit-codes (bit
elements A1 through A7). The numerical values of the range levels are
tabulated in Section 4.4.


*More precisely, an RMS value, in that the sample means were close
to, but not exactly, zero.
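
Assigning a measured value to its range interval amounts to a search through the ordered range levels. The sketch below uses the A, B range levels tabulated for the SV500 field in Section 4.4. The numbering direction (interval 1 taken as the lowest range) and the treatment of a value lying exactly on a level are assumptions, since Fig. 7 governs the actual convention.

```python
from bisect import bisect_right

# Range levels for parameters A and B of the SV500 field (cm), ascending.
SV500_AB_LEVELS = [-7930, -5365, -3149, -1493, 0, 1493, 3149, 5365, 7930]

def range_interval(value, levels):
    """Return the 1-based range interval containing value.

    Nine levels bound ten intervals. A value equal to a level is placed
    in the interval above it (assumed convention).
    """
    return bisect_right(levels, value) + 1

lowest = range_interval(-9000, SV500_AB_LEVELS)   # below every level
highest = range_interval(8000, SV500_AB_LEVELS)   # above every level
```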

Fig. 6  The seventeen parameters which are bit coded for each 4x4 module
of the grid array are shown in five subsets. A and B are actual
parameter values at the two grid points indicated. The other
parameters, C through Q, are differences. To calculate the value
of any difference parameter, the value at the non-lettered end of
the line segment is subtracted from the value at the lettered end.
Parameter C alternates in orientation between even and odd numbered
modules.


The range levels which define the range intervals are specified to
the program which bit-codes the gridded fields. σ was calculated for each
of the nine component fields based on a summer and winter sampling of each
parameter. The samples were taken from all modules of the SD and SL
component  fields,  but  were  confined  to  modules  3,  8,  10  and  13  of  the 
three  SV-anomaly  component  fields  in  order  to  accentuate  the  significant 
SV  variabilities  occurring  in  these  modules  compared  with  those  of  more 
southerly  modules.  The  percentages,  given  as  a  normal  expectance  of 
occurrence  for  each  range  interval  in  Fig.  7,  are  based  on  an  approximation 
to  a  normal  distribution;  they  have  not  been  verified  by  actually  counting 
occurrences.  However  this  is  not  critical  because  the  distributions  vary 
from  season  to  season  and  from  day  to  day.  Values  for  the  range  levels 
for  all  seventeen  parameters  are  given  in  Section  4.4. 
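
The normal expectance of occurrence for a range interval can be estimated from the normal cumulative distribution once the range levels are expressed in units of σ. The sketch below assumes a zero-mean normal distribution. The example levels in σ units are hypothetical and are not the values used in Fig. 7; the `rms` helper reflects the footnoted remark that σ is, more precisely, an RMS value.

```python
import math

def normal_cdf(z):
    """Cumulative distribution of the standard normal at z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def interval_expectancies(levels_in_sigma):
    """Percentage of a zero-mean normal population falling in each interval.

    levels_in_sigma: ascending range levels divided by sigma.
    """
    cdf = [0.0] + [normal_cdf(z) for z in levels_in_sigma] + [1.0]
    return [100.0 * (hi - lo) for lo, hi in zip(cdf, cdf[1:])]

def rms(samples):
    """Root-mean-square of a sample, used in place of a standard deviation."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))

# Hypothetical levels at -1, 0 and +1 sigma, giving four intervals.
pct = interval_expectancies([-1.0, 0.0, 1.0])
```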

Bit-codes  are  allotted  to  the  other  sixteen  parameters  in  a  manner 
similar  to  that  described  for  parameter  A.  The  bits  assigned  to  all  seventeen 
parameters are composed into a 60-bit word of code for that module. How
this  is  done  is  described  in  detail  in  Section  4.5  but,  basically,  the 
composition  of  the  word  is  so  designed  that  the  first  quarter  of  the  word 
gives  a  coarse  resolution  of  a  module  pattern,  the  second  quarter  gives  a 
medium  resolution  supplement,  and  the  remaining  half-word  gives  a  fine 
resolution  supplement.  This  composition  provides  a  very  flexible  system. 
Thus  portions  of  the  full  resolution  words  can  be  assembled  to  provide  other 
resolutions;  for  example,  a  new  word  formed  from  the  first  quarters  of  a 
block  of  4  words  gives  a  coarse  resolution  coding  of  an  8x8  module.  This 
flexibility  for  forming  subsets  of  the  basic  bit  code  is  exploited  in  regionally 
focusing  the  search  for  analogues. 
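
The quarter-word layout lends itself to simple shift-and-mask operations. The sketch below assumes that the "first quarter" occupies the 15 most significant bits of the 60-bit word; that bit ordering, and the helper names, are assumptions made for illustration only.

```python
WORD_BITS = 60
QUARTER = WORD_BITS // 4            # 15 bits
QUARTER_MASK = (1 << QUARTER) - 1

def coarse(word):
    """First quarter of the word: coarse resolution of the module pattern."""
    return (word >> (3 * QUARTER)) & QUARTER_MASK

def medium(word):
    """Second quarter of the word: medium resolution supplement."""
    return (word >> (2 * QUARTER)) & QUARTER_MASK

def fine(word):
    """Remaining half-word: fine resolution supplement."""
    return word & ((1 << (2 * QUARTER)) - 1)

def coarse_8x8(words):
    """Pack the first quarters of a block of 4 module words into one 60-bit word."""
    if len(words) != 4:
        raise ValueError("a block consists of exactly 4 module words")
    packed = 0
    for w in words:
        packed = (packed << QUARTER) | coarse(w)
    return packed

# Hypothetical word with recognisable contents in each part.
word = (5 << 45) | (9 << 30) | 77
block = coarse_8x8([1 << 45, 2 << 45, 3 << 45, 4 << 45])
```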

In  comparing  two  fields  to  determine  their  mutual  degree  of  pattern 
similarity,  the  two  sets  of  bit-coded  words  are  simply  matched,  one  with 
the  other.  The  count  of  the  number  of  corresponding  bits  that  match  is  an 
absolute  measure  of  the  degree  of  pattern  similarity  between  the  fields 
associated with each module. Extending this concept to include all
modules representing the total field, it can be seen that this scheme can
give a word-by-word (i.e., module-by-module) accounting of pattern
similarity.
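
A minimal sketch of the module-by-module accounting, assuming each field is held as a list of 60-bit integers, one per module. The matching bits are counted here by inverting the exclusive-OR of the two words; this is one convenient way to obtain the count and is not necessarily how the FNWC program computes it. The optional `mask` argument is a hypothetical addition for restricting the count to part of a word.

```python
WORD_MASK = (1 << 60) - 1  # all 60 bits of a module word

def matching_bits(word_a, word_b, mask=WORD_MASK):
    """Count the bit positions within mask where the two words agree."""
    agree = ~(word_a ^ word_b) & mask
    return bin(agree).count("1")

def module_scores(baseday_words, candidate_words, mask=WORD_MASK):
    """Matching-bit count for each module, in module order."""
    return [matching_bits(a, b, mask)
            for a, b in zip(baseday_words, candidate_words)]

# Two hypothetical modules: one differing in a single bit, one in every bit.
baseday = [0b1011, 0b0000]
candidate = [0b1001, WORD_MASK]
scores = module_scores(baseday, candidate)
total = sum(scores)
```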

In Fig. 7 it will be noted that bit elements A6 and A7 (or B6 and B7
for  parameter  B)  "flip"  at  the  range  levels  corresponding  to  range  intervals 
9  and  10.  This  results  in  unwarranted  bit  matching  when  comparing 
corresponding  pairs  of  these  measures  from  two  modular  patterns.  This  will 
only  occur  when  comparing  the  most  mismatched  pairs  of  measures — for 
example,  if  pattern  x  has  parameter  A  in  range  interval  1  and  pattern  y  has 
parameter  A  in  range  interval  10,  then  bit  elements  A6  and  A7  will  match. 
However,  in  such  cases  the  entire  module  will  generally  score  very  low 
and  this  minor  detraction  is  accepted  in  order  to  create  extra  range  intervals 
without  adding  extra  bits  of  coding. 

Figure  7  shows  the  range  intervals,  range  levels  and  bit  coding  for 
parameters  A  and  B;  Fig.  8  shows  the  associated  scoring  matrix,  giving 
the  number  of  matching  bits  obtained  when  two  modules  are  compared. 

Note  the  effect  (top-right  and  bottom-left  corners  of  the  scoring  matrix) 
due  to  the  spurious  bit  matching  of  elements  A6  and  A7  or  B6  and  B7. 

Figures  9  through  14  show  similar  tabulations  for  the  other  pattern-specifying 
parameters  and  their  associated  scoring  matrices. 
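
A scoring matrix of this kind follows mechanically from a bit-code table: entry (i, j) is the number of bit positions in which the codes of range intervals i and j agree. Because the actual codes are those of the figures, the sketch below uses two hypothetical codes for four range intervals, purely to illustrate the construction and the corner effect produced when a bit element reverts at the outermost interval.

```python
def scoring_matrix(codes):
    """Build the matching-bit matrix for a list of equal-length bit strings.

    codes: one bit string per range interval, in interval order.
    """
    return [[sum(x == y for x, y in zip(a, b)) for b in codes] for a in codes]

# Hypothetical 3-bit code in which bits switch on one at a time.
monotonic = scoring_matrix(["000", "001", "011", "111"])

# Hypothetical 2-bit code in which the second bit reverts at the last
# interval, so the two extreme intervals share one bit.
wrapped = scoring_matrix(["00", "01", "11", "10"])
```

In the first matrix the score falls steadily with distance from the diagonal. In the second, the corner entry `wrapped[0][3]` is higher than its neighbour `wrapped[0][2]`, which is the same kind of spurious corner matching described above for elements A6 and A7.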

Fig. 8  The Scoring Matrix for Parameters A and B. Shown are the
counts of the number of matching bits in comparing the bit
code of range intervals.

Fig. 10  The Scoring Matrix for Parameters C, F, G, H and I. Shown
are the counts of the number of matching bits in comparing
the bit code of range intervals.

Fig. 11  Specification of Bit Code for Parameter D. The Bit Code for
Parameter E is Similar.

Fig. 12  The Scoring Matrix for Parameters D and E. Shown are the
counts of the number of matching bits in comparing the bit
code of range intervals.

Fig. 14  The Scoring Matrix for Parameters J through Q. Shown are the
counts of the number of matching bits in comparing the bit
code of range intervals.

4.4  Tabulation of Range Levels


As  described  in  Section  4.3  the  range  levels  are  defined  in  terms 
of a mean standard deviation, σ, calculated for each parameter by a
sampling technique. The following tabulations show, for each of the nine
component fields, the numerical values of the range levels for the
seventeen pattern-specifying parameters.

Field:  SV500  (all units in cm)

A,B:         7930    5365    3149    1493       0   -1493   -3149   -5365   -7930
C:                           9847    5193       0   -5193   -9847
D,E:                12550    7910    3993       0   -3993   -7910  -12550
F,G,H,I:                     6538    3448       0   -3448   -6538
J-Q:                                 3195       0   -3195

Field:  SV1000

A,B:         7354    4975    2920    1384       0   -1384   -2920   -4975   -7354
C:                           9860    5200       0   -5200   -9860
D,E:                13875    8745    4415       0   -4415   -8745  -13875
F,G,H,I:                     6902    3640       0   -3640   -6902
J-Q:                                 3531       0   -3531

Field:  SV5-10

A,B:         7480    5060    2970    1408       0   -1408   -2970   -5060   -7480
C:                           9779    5157       0   -5157   -9779
D,E:                12670    7986    4031       0   -4031   -7986  -12670
F,G,H,I:                     7709    4066       0   -4066   -7709
J-Q:                                 3226       0   -3226
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Field:  SL500 


A, B:        12407   8393   4926   2335   0   -2335   -4926   -8393   -12407
C:           13810   7283   0   -7283   -13810
D, E:        15650   9864   4980   0   -4980   -9864   -15650
F, G, H, I:  9544   5033   0   -5033   -9544
J-Q:         4783   0   -4783


Field: SL1000

A, B:        5153   3486   2046   970   0   -970   -2046   -3486   -5153
C:           5828   3073   0   -3073   -5828
D, E:        7178   4524   2284   0   -2284   -4524   -7178
F, G, H, I:  4465   2354   0   -2354   -4465
J-Q:         2243   0   -2243


Field: SL5-10

A, B:        10977   7426   4358   2066   0   -2066   -4358   -7426   -10977
C:           11790   6217   0   -6217   -11790
D, E:        13972   8807   4446   0   -4446   -8807   -13972
F, G, H, I:  8225   4338   0   -4338   -8225
J-Q:         4260   0   -4260
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p 


Field: SD500

A, B:        7480   5060   2970   1408   0   -1408   -2970   -5060   -7480
C:           8018   4228   0   -4228   -8018
D, E:        10058   6340   3200   0   -3200   -6340   -10058
F, G, H, I:  5704   3008   0   -3008   -5704
J-Q:         2520   0   -2520


Field: SD1000

A, B:        4420   2990   1755   832   0   -832   -1755   -2990   -4420
C:           4851   2558   0   -2558   -4851
D, E:        6115   3854   1946   0   -1946   -3854   -6115
F, G, H, I:  3933   2074   0   -2074   -3933
J-Q:         1680   0   -1680


Field: SD5-10

A, B:        6630   4485   2632   1248   0   -1248   -2632   -4485   -6630
C:           7142   3766   0   -3766   -7142
D, E:        8781   5535   2794   0   -2794   -5535   -8781
F, G, H, I:  5394   2844   0   -2844   -5394
J-Q:         2352   0   -2352
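The range levels tabulated above divide the numerical measure of a parameter into range intervals; the nine levels for A and B give ten intervals. A minimal Python sketch of such a test sequence follows, using the SV500 levels for parameters A and B. The function name, the numbering of intervals from the highest range downward, and the treatment of a value exactly equal to a level are assumptions made for illustration and are not taken from the RASS specification.

```python
# Range levels for parameters A and B of field SV500 (cm), highest first.
SV500_AB_LEVELS = [7930, 5365, 3149, 1493, 0, -1493, -3149, -5365, -7930]


def range_interval(value, levels):
    """Return the index of the range interval containing `value`.

    Interval 0 lies above the highest level and interval len(levels)
    lies below the lowest.  The numbering direction is an assumption.
    """
    # Progressive tests: step down through the levels until one is exceeded.
    for index, level in enumerate(levels):
        if value > level:
            return index
    return len(levels)


if __name__ == "__main__":
    for sample in (8000, 1000, 0, -9000):
        print(sample, "-> interval", range_interval(sample, SV500_AB_LEVELS))
```

Nine levels yield ten possible return values, which agrees with the ten-word library for parameter A described in Section 4.5.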
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4.5 


Formation  of  the  60-Bit  Word  for  a  4x4  Module 


As  outlined  in  Section  4.3  the  bits  allotted  to  the  seventeen 
parameters  of  a  module  are  composed  into  one  60-bit  word  of  code  for 
that  module.  The  bit  elements  are  arranged  in  such  a  way  that,  for  any 
word,  bits  1-15  give  a  coarse  resolution  of  the  module  pattern,  bits  16-30 
give  a  medium  resolution  supplement  and  bits  31-60  give  a  fine  resolution 
supplement.  The  following  table  shows  the  bit  number  location  of  each 
parameter  element: 


Bit Position in      Parameter Element
the 60-Bit Word      Located There

Bits 1-15 provide coarse pattern resolution:

       1                   A1
       2                   A3
       3                   A5
       4                   B1
       5                   B3
       6                   B5
       7                   C1
       8                   C2
       9                   C3
      10                   D2
      11                   D4
      12                   D5
      13                   E2
      14                   E4
      15                   E5

Bits 16-30 provide a medium pattern resolution supplement:

      16                   A6
      17                   B6
      18                   C4
      19                   F1
      20                   F2
      21                   F3
      22                   G1
      23                   G2
      24                   G3
      25                   H1
      26                   H2
      27                   H3
      28                   I1
      29                   I2
      30                   I3
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Bit  Position  in 
the  60-Bit  Word 


Parameter  Element 
Located  There 


Bits  31-60  provide 
a  fine  pattern 
resolution 
supplement 


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It  is  instructive  to  follow  the  construction  of  a  60-bit  word  from 
the  seventeen  contributing  parameters. 

For  any  given  parameter,  say  parameter  A,  there  exists  a  "library" 
of  ten  60-bit  words,  one  for  each  of  the  ten  range  intervals  associated  with 
parameter  A.  The  range  interval  is  determined  by  progressive  tests  on  the 
actual  numerical  measure  of  parameter  A  (see  Section  4.4)  and  the  appropriate 
word representing this range interval is selected from the library. For
example, the library for parameter A is as follows:

Bit Positions:    1    2    3    4-15   16   17-30   31   32   33   34-60
Bit Elements:     A1   A3   A5          A6           A2   A4   A7

Range Interval
      1           0    0    0    0-0    0    0-0     0    0    0    0-0
      2           0    0    0    0-0    0    0-0     0    0    1    0-0
      3           0    0    0    0-0    1    0-0     0    0    1    0-0
      4           0    0    1    0-0    1    0-0     0    0    1    0-0
      5           0    0    1    0-0    1    0-0     0    1    1    0-0
      6           0    1    1    0-0    1    0-0     0    1    1    0-0
      7           0    1    1    0-0    1    0-0     1    1    1    0-0
      8           1    1    1    0-0    1    0-0     1    1    1    0-0
      9           1    1    1    0-0    1    0-0     1    1    0    0-0
     10           1    1    1    0-0    0    0-0     1    1    0    0-0

The above library should be compared with Fig. 7; note that the bit
columns have been rearranged to separate A1, A3, A5 (coarse pattern
resolution), A6 (the medium resolution supplement), and A2, A4, A7 (the
fine resolution supplement).
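A property of the library for parameter A can be checked directly: neighbouring range intervals differ in exactly one bit element, so that two situations falling in adjacent intervals still match in six of the seven bits allotted to A. The Python sketch below transcribes the ten library rows (bit elements written in the column order A1 A3 A5, A6, A2 A4 A7) and counts matching bits between any two intervals. The names `LIBRARY_A` and `matching_bits` are illustrative only.

```python
# Library rows for parameter A in the tabulated order (range intervals 1-10).
# Each string lists the bit elements in the order A1 A3 A5 | A6 | A2 A4 A7.
LIBRARY_A = [
    "0000000",
    "0000001",
    "0001001",
    "0011001",
    "0011011",
    "0111011",
    "0111111",
    "1111111",
    "1111110",
    "1110110",
]


def matching_bits(a, b):
    """Count the positions at which two equal-length bit strings agree."""
    if len(a) != len(b):
        raise ValueError("bit strings must have equal length")
    return sum(x == y for x, y in zip(a, b))


if __name__ == "__main__":
    # Print the full matrix of matching-bit counts between range intervals.
    for row in LIBRARY_A:
        print(" ".join(str(matching_bits(row, other)) for other in LIBRARY_A))
```

Restricting the comparison to the first three characters of each row gives the coarse-resolution count alone.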

A similar process is followed for the remaining 16 parameters. The
60-bit word representing the module initially has all bits set to zero and
is then formed by accrual of the 17 contributory 60-bit words representing
the range intervals of the parameters. The total library contains 98 words,
made up as follows:


A and B                         10 each     total  20
C, F, G, H and I                 6 each     total  30
D and E                          8 each     total  16
J, K, L, M, N, O, P and Q        4 each     total  32
                                                   --
                                                   98
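The accrual of contributory words can be sketched as a bitwise OR into a word that starts at zero. The Python example below uses the bit positions given above for parameter A, and the coarse and medium positions for parameter B. Treating bit 1 as the leftmost bit, holding the word in a Python integer, and the particular bit elements chosen for B are assumptions for illustration; the fine-supplement positions of B are not used here.

```python
WORD_BITS = 60

# Bit positions (1-based) of the bit elements, from the tables above.
A_POSITIONS = {"A1": 1, "A3": 2, "A5": 3, "A6": 16, "A2": 31, "A4": 32, "A7": 33}
B_POSITIONS = {"B1": 4, "B3": 5, "B5": 6, "B6": 17}  # coarse and medium only


def contribution(positions, set_elements):
    """Build a 60-bit contributory word with the named bit elements set."""
    word = 0
    for name in set_elements:
        word |= 1 << (WORD_BITS - positions[name])  # bit 1 taken as leftmost
    return word


def accrue(contributions):
    """Form the module word: start from zero and OR in each contribution."""
    word = 0
    for item in contributions:
        word |= item
    return word


def as_bits(word):
    """Render a word as a 60-character string of 0s and 1s."""
    return format(word, "060b")


# Parameter A in its fourth range interval sets A5, A6 and A7.
word_a = contribution(A_POSITIONS, ["A5", "A6", "A7"])
# Hypothetical contribution for parameter B (its library is not reproduced here).
word_b = contribution(B_POSITIONS, ["B5", "B6"])

bits = as_bits(accrue([word_a, word_b]))
coarse, medium, fine = bits[:15], bits[15:30], bits[30:]
```

Because each parameter owns its own bit positions, the OR never overwrites a bit set by another parameter.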

Section  3.2.1  laid  down  the  four  basic  components  of  any  analogue 
forecasting  scheme.  It  can  be  seen  that  the  first  of  these— a  methodology 
for  interpreting  any  meteorological  event  in  terms  of  relevant  specifying 
parameters— is  accomplished  by  the  RASS  techniques  described  in 
Section  4. 
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5. 


PRODUCTION OF THE BIT-CODED HISTORY


The second basic component of any analogue selection system (see
Section 3.2.1) is the production of a data base in terms of the specifying
parameters used to represent any meteorological events.

The source data set consists of about 160 large reel magnetic tapes
and production of the bit-coded history was accomplished in two separate
phases. The first step was to extract and/or generate the required nine
component fields (see Section 3.2.2.2) from this source data set. Then
these fields were processed into the full hemispheric bit code described
in Section 4.

The coded data is organised on 8 large reel magnetic tapes and is
the source for generation of any regional subset of the data.


6.


REGIONAL FOCUS CAPABILITIES


This Section contains a description of the methods used for focusing
on a selected region; the Greater Mediterranean will be used as an example.
The techniques involved in searching for and selecting analogues (the third
basic component of an analogue system; see Section 3.2.1) are described
in Section 7.

6.1  General  Approach 

As described in Section 4.1 the bit-code is formulated in terms of
a modularization of the gridded fields, a module consisting of a 4x4 array
of grid-point values with the grid-point spacing being dependent on the
3 inherent ranges-of-scale.

On the SD range-of-scale the Northern Hemisphere is covered by
144 modules (see Fig. 3), the SL range-of-scale by 36 modules (see
Fig. 4), and the SV range-of-scale by 16 modules (see Fig. 5). Each
range-of-scale involves three fields (1000-mb, 500-mb, 500-1000-mb
thickness) and thus a bit string of (144 + 36 + 16) x 3 = 588 words is
required to represent the nine component fields of each synoptic
situation; i.e., each date-time group in the bit-coded history.

Subsets of this coded history may be extracted to suit any purpose.
A subset extracted for a specific region, such as the Greater Mediterranean,
constitutes a "regional focus" subset of the data. As discussed in Sections
6.2 and 6.3, the required resolution should be taken into account when
compiling a regional focus data base.

From the regional focus subset of the bit-coded history, specific
date-time groups may be selected and combined to produce a bit-string
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representing a regional focus scenario. Such a combination may be expressed
by

$$ SC_{(\tau-1)\rightarrow\tau} = S_{\tau-1} + S_{\tau} $$

where the scenario taking place from time $(\tau-1)$ to $\tau$,
$SC_{(\tau-1)\rightarrow\tau}$, is a combination of the situation $S$ at
$(\tau-1)$ and $\tau$.

A  search  for  analogues  similar  to  the  baseday  scenario  requires, 
of  course,  that  analogue  candidate  scenarios  and  the  baseday  scenario, 
are  both  bit-coded  in  the  same  way;  the  similarity  score  between  the 
baseday  scenario  and  a  particular  analogue  candidate  scenario  can  then 
be  based  on  a  count  of  the  matching  bits. 
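A minimal Python sketch of scenario coding and scoring follows. It assumes that the combination of situations is a simple concatenation of their bit strings, oldest first, and that the similarity score is the plain count of agreeing bit positions; the 8-bit strings are invented for illustration and are far shorter than a real regional-focus subset.

```python
def scenario(*situations):
    """Combine the bit strings of successive situations (oldest first)."""
    return "".join(situations)


def matching_bits(a, b):
    """Count the positions at which two equal-length bit strings agree."""
    if len(a) != len(b):
        raise ValueError("scenarios must be coded in the same way")
    return sum(x == y for x, y in zip(a, b))


# Invented 8-bit situations at times tau-1 and tau.
baseday = scenario("10110010", "10110110")
candidate = scenario("10100010", "00110110")

score = matching_bits(baseday, candidate)
print(score, "of", len(baseday), "bits match")
```

The length check reflects the requirement that the baseday and candidate scenarios are bit-coded in the same way before they are compared.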

The  concept  of  scenario  coding,  matching,  and  analogue  selection 
is  discussed  in  greater  detail  in  Section  7. 


6.2  Standardized  Approach 

A  standard  procedure  has  been  devised  for  specifying  a  regional 
focus  and  for  extracting  the  required  bit-coded  data  subset.  This  standard 
procedure  allows  a  selected  list  of  module  numbers  to  be  specified  for  each 
of  the  component  fields,  this  list  depending  on  the  particular  region  of 
interest.  (The  reason  for  specifying  module  numbers  for  each  component 
field  is  to  provide  greater  realism  and  flexibility— the  modules  required 
for  representation  of  one  range-of-scale  are  not  generally  the  same  as  for 
other  ranges-of-scale.)  The  degree  of  resolution  required  for  every  listed 
module  has  also  to  be  specified:  for  coarse  resolution  the  first  quarter  of 
the  bit-coded  word  for  that  module  is  extracted;  for  medium  resolution  the 
first half of the word is extracted; and for fine resolution the full word is
extracted. The resulting subset for each component field is rearranged
into  three  groups  of  words: 

a.  Coarse-resolution  words  formed  by  stringing  together  all 
quarter-words  which  had  resided  in  the  upper  quarter  word 
before  extraction; 

b.  Medium-resolution-supplement  words  formed  by  stringing 
together  all  quarter-words  which  had  resided  in  the  second 
quarter-word  before  extraction; 

c.  Fine-resolution-supplement  words  formed  by  stringing  together 
all  half-words  which  had  resided  in  the  second  half  of  the  word 
before  extraction. 

(Zeroes  are  used  as  necessary  to  complete  the  last  word  of  each  group  so 
formed. ) 
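The extraction and regrouping described above can be sketched in Python as follows. The sketch assumes that each module word is held as a 60-character bit string with bit 1 first, so that the first quarter is bits 1-15, the second quarter bits 16-30 and the second half bits 31-60; the function names and the resolution labels are illustrative.

```python
WORD = 60


def pack(pieces):
    """String pieces together and zero-fill the last 60-bit word."""
    stream = "".join(pieces)
    stream += "0" * (-len(stream) % WORD)
    return [stream[i:i + WORD] for i in range(0, len(stream), WORD)]


def extract(modules):
    """Regroup (word, resolution) pairs into coarse, medium and fine words.

    `resolution` is one of "coarse", "medium" or "fine".
    """
    coarse = [w[:15] for w, _ in modules]
    medium = [w[15:30] for w, r in modules if r in ("medium", "fine")]
    fine = [w[30:] for w, r in modules if r == "fine"]
    return pack(coarse), pack(medium), pack(fine)


# Five invented modules, all bits set, with mixed resolutions.
example = [("1" * WORD, r) for r in
           ("coarse", "coarse", "medium", "fine", "fine")]
coarse_words, medium_words, fine_words = extract(example)
```

Five coarse quarter-words occupy 75 bits and therefore need two words, the second being zero-filled after its first 15 bits.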

6.3 The Greater Mediterranean Focus


The standardized approach outlined in Section 6.2 has been applied
to the Greater Mediterranean region. The regional focus is specified by
range-of-scale component fields and Figs. 15, 16 and 17 show the focus
for the SD, SL and SV fields respectively. Extractions are made only for
the modules where numbers have been circled. A double circle indicates
fine resolution, a double bar under the number indicates medium resolution,
and a single bar indicates coarse resolution for that module.

To construct the specification list (see Section 6.2) only the lowest
module number of each group of four modules is listed; e.g., specifying
module number 37 automatically incorporates modules 37 through 40.
Following a module number a code is used to specify the resolution required
for each of the four associated modules in numerical order. The resolution
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TXT 


Fig. 15 Greater-Mediterranean Focus for the SD Component Fields.
Extractions are made only for the circled-number modules.
A double circle indicates fine resolution. A double bar under
the number indicates medium resolution. And a single bar
under the number indicates coarse resolution for that module.
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Fig. 16 Greater-Mediterranean Focus for the SL Component Fields.
Extractions are made only for the circled-number modules.

A double circle indicates fine resolution. A double bar under
the number indicates medium resolution. And a single bar
under the number indicates coarse resolution for that module.
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Fig. 17 Greater-Mediterranean Focus for the SV Component Fields.
Extractions  are  made  only  for  the  circled-number  modules. 

A  double  circle  indicates  fine  resolution.  A  double  bar  under 
the  number  indicates  medium  resolution.  And  a  single  bar 
under  the  number  indicates  coarse  resolution  for  that  module. 
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code  is  1  for  coarse,  2  for  medium,  and  4  for  fine  resolution.  Thus,  for 
example,  a  specification  list  element  given  as  65:2244  is  interpreted  as 


module 65 -- medium resolution
       66 -- medium resolution
       67 -- fine resolution
       68 -- fine resolution.
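A short Python sketch of how such a specification list element might be decoded is given below. The function name is hypothetical, and the treatment of a code of 0 as "module not extracted" is an assumption based on the zeros that appear in the specification list for the Greater Mediterranean.

```python
# Resolution codes; 0 is assumed to mean that the module is not extracted.
RESOLUTION = {"0": None, "1": "coarse", "2": "medium", "4": "fine"}


def parse_element(element):
    """Decode an element such as '65:2244' into {module number: resolution}."""
    first, codes = element.split(":")
    if len(codes) != 4:
        raise ValueError("expected one code for each of four modules")
    return {int(first) + k: RESOLUTION[c] for k, c in enumerate(codes)}


decoded = parse_element("65:2244")
for module, resolution in decoded.items():
    print(module, resolution)
```

An unrecognised code digit raises a `KeyError`, which seems preferable to silently skipping a module.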


From  Figs.  15,  16  and  17  the  following  specification  list  for  the 
Greater  Mediterranean  Region  may  be  constructed: 


SD Fields          SL Fields          SV Fields

 37: 1111            5: 0011            5: 0044
 41: 1111            9: 0002           13: 4400
 45: 0001           17: 1221
 61: 1221           21: 4114
 65: 2244           29: 0100
 69: 2002           33: 1000
 85: 1210
 89: 4442
 93: 4002
113: 1200
117: 1000


Rearrangement  produces  the  following  groups  of  words: 


For an SD Field      For an SL Field      For an SV Field

8 coarse             4 coarse             1 coarse
4 medium             2 medium             1 medium
3 fine               1 fine               2 fine

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Thus, for the Greater Mediterranean region, an SD field requires
15 words, an SL field requires 7 words, and an SV field 4 words, a total
of 78 words for all nine component fields. To produce the data subset
these 78 words are extracted and formed from the full set of 588 words
for each date-time group.
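The word counts quoted above can be reproduced from the specification list. The Python sketch below assumes, following Section 6.2, that every listed module contributes one quarter-word to the coarse group, every module of medium or fine resolution one quarter-word to the medium group, and every module of fine resolution one half-word to the fine group, with the last word of each group zero-filled. The names `SPEC` and `word_counts` are illustrative.

```python
from math import ceil

# Specification list for the Greater Mediterranean region.
SPEC = {
    "SD": ["37:1111", "41:1111", "45:0001", "61:1221", "65:2244", "69:2002",
           "85:1210", "89:4442", "93:4002", "113:1200", "117:1000"],
    "SL": ["5:0011", "9:0002", "17:1221", "21:4114", "29:0100", "33:1000"],
    "SV": ["5:0044", "13:4400"],
}


def word_counts(elements):
    """Return (coarse, medium, fine) 60-bit word counts for one field."""
    codes = [int(c) for e in elements for c in e.split(":")[1]]
    coarse = sum(c >= 1 for c in codes)   # one quarter-word per module
    medium = sum(c >= 2 for c in codes)   # one quarter-word per module
    fine = sum(c == 4 for c in codes)     # one half-word per module
    return ceil(coarse / 4), ceil(medium / 4), ceil(fine / 2)


counts = {scale: word_counts(elements) for scale, elements in SPEC.items()}
# Three fields (1000-mb, 500-mb, thickness) per range-of-scale.
total_words = 3 * sum(sum(c) for c in counts.values())
print(counts, total_words)
```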

In  general,  the  production  of  a  regional-focus  data  subset  is 
accomplished  in  a  single  computer  production  run  using  the  full  data  set 
of  8  tapes  as  input.  The  number  of  output  tapes  generated  for  any  subset 
depends,  of  course,  on  the  size  of  the  region  being  considered  but, 
typically,  this  would  be  a  single  tape. 




7. 


THE  ANALOGUE  SEARCH  AND  SELECTION  PROCESS 


7.1 Introduction

Section 7 presents and discusses the MII methodology for providing
the third basic component of any analogue selection system laid down in
Section 3.2.1.

7.2 Preparing the Baseday

As  defined  in  Section  3.1,  the  baseday  may  be  specified  as  either 
the  current  synoptic  situation  or  the  synoptic  situation  corresponding  to 
any  date-time  group  in  the  history.  Clearly,  if  the  current  situation  is 
chosen,  then  analogue  selection  must  begin  with  the  bit-coding  of  its 
component  fields  and  extraction  of  the  regional-focus  elements.  For 
scenario  matching  using  the  current  situation,  the  appropriate  bit-strings 
must  be  formed  incorporating  the  time  element. 

7.3 The Scoring Matrix

Each regional-focus subset representing a single synoptic situation
consists of 27 subgroups of words, thus:

3  ranges-of-scale  (SV,  SL,  SD) 

x  3  contour  values  (1000-mb,  500-mb,  500-1000-mb  thickness) 

x  3  resolutions  (coarse,  medium,  fine). 

Each  analogue  candidate  from  the  history  is  scored  by  comparing  its 
subset  to  that  of  the  baseday,  the  count  of  the  number  of  matching  bits 
being  a  measure  of  the  similarity  between  the  patterns  corresponding  to  the 
two  situations.  The  counting  of  matching  bits  proceeds  in  stages, 
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commencing  with  coarse  resolution  words,  then  medium  resolution  words, 
and  ending  with  fine  resolution  words.  At  each  stage  there  is  a  "gate" — if 
the  number  of  matching  bits  does  not  reach  an  assigned  minimum  level, 
then  the  analogue  candidate  is  rejected  at  that  stage.  This  procedure 
speeds  up  the  selection  process  considerably. 

The  counting  of  matching  bits  and  gate  checks  for  minimum  counts 
proceeds  as  follows: 

Count Coarse SV500
Count Coarse SV1000
Count Coarse SV5-10
Count Coarse SL500
Count Coarse SL1000
Count Coarse SL5-10
Count Coarse SD500
Count Coarse SD1000
Count Coarse SD5-10

Total the Coarse counts

Count Medium SV500
Count Medium SV1000
Count Medium SV5-10
Count Medium SL500
Count Medium SL1000
Count Medium SL5-10
Count Medium SD500
Count Medium SD1000
Count Medium SD5-10

Total the Medium counts
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Count Fine SV500
Count Fine SV1000
Count Fine SV5-10
Count Fine SL500
Count Fine SL1000
Count Fine SL5-10
Count Fine SD500
Count Fine SD1000
Count Fine SD5-10

Total the SV500 counts
Total the SV1000 counts
Total the SV5-10 counts
Total the SL500 counts
Total the SL1000 counts
Total the SL5-10 counts
Total the SD500 counts
Total the SD1000 counts
Total the SD5-10 counts
Total the Fine counts


In  the  above  procedure  for  counting  matching  bits  there  are  39  gates, 
each  of  which  must  be  passed  by  an  analogue  candidate  before  being 
considered  for  the  next  stage  in  the  selection  process.  In  addition  to  a 
listing  of  gate  values  to  be  exceeded,  the  system  contains  a  listing  of 
weight  factors  to  be  applied  to  the  actual  scores  achieved  at  each  stage. 
These  weights  can  be  adjusted  (tuned)  to  emphasize  any  desired  feature  or 
combination  of  features  (i.e.,  range-of-scale,  level,  thickness,  resolution). 
A  final  count  is  then  made  which  is  the  weighted  total  of  the  39  contributing 
counts — this  final  count  is  the  "analogue  score"  and  is  used  to  rank  the 
selected  analogues.  If  the  number  of  analogues  selected  does  not  reach  a 
specified  minimum,  e.g. ,  100,  then  the  selection  process  is  repeated 
after  lowering  all  gates  by  10%.  For  uncommon  basedays  this  process  may 
have  to  be  repeated  more  than  once. 
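The gated selection loop can be sketched in Python as follows. This is a deliberately reduced illustration, not the RASS implementation: it keeps only three gates (one per resolution stage) instead of the 39 described above, and the function names, the cap on the number of repetitions, and the toy 4-bit strings are all assumptions.

```python
STAGES = ("coarse", "medium", "fine")


def matching_bits(a, b):
    """Count the positions at which two equal-length bit strings agree."""
    return sum(x == y for x, y in zip(a, b))


def score(base, cand, gates, weights):
    """Weighted analogue score, or None if the candidate fails a gate."""
    total = 0.0
    for stage in STAGES:
        count = matching_bits(base[stage], cand[stage])
        if count < gates[stage]:
            return None          # rejected here; later stages are never counted
        total += weights[stage] * count
    return total


def select(base, history, gates, weights, minimum, max_rounds=10):
    """Rank candidates, lowering all gates by 10% until `minimum` are found."""
    gates = dict(gates)
    ranked = []
    for _ in range(max_rounds):
        ranked = []
        for name, cand in history.items():
            value = score(base, cand, gates, weights)
            if value is not None:
                ranked.append((value, name))
        ranked.sort(reverse=True)
        if len(ranked) >= minimum:
            break
        gates = {stage: 0.9 * level for stage, level in gates.items()}
    return ranked, gates


base = {stage: "1111" for stage in STAGES}
history = {
    "a": {stage: "1111" for stage in STAGES},
    "b": {stage: "1110" for stage in STAGES},
    "c": {stage: "0000" for stage in STAGES},
}
unit = {stage: 1.0 for stage in STAGES}
ranked, final_gates = select(base, history, {s: 4 for s in STAGES}, unit, 2)
```

Rejecting a candidate at the first failed gate is what saves computer time: the medium and fine words of most candidates are never examined.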
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Comparison  with  Monthly-Mean  Fields 


From the historical data, monthly-mean hemispheric climatologies
have been compiled for all nine component fields in bit-coded format. A
climatological regional focus subset may be extracted for any region; one
such subset has been extracted for the Greater Mediterranean. The
climatic group for the month of the baseday is forced past all gates, thus
enabling its (weighted) final score to be used for reference purposes.

7.5 Probability Considerations


In  any  study  and  design  of  an  analogue  system  it  is  of  interest  to 
consider  the  effects  of  chance  in  determining  the  degree  of  similarity 
obtained  between  a  baseday  and  an  analogue  candidate.  For  RASS,  a  very 
simple  model  will  be  presented,  based  on  modularization  and  bit-coding 
concepts . 

Consider  a  parameterization  scheme  which  enables  the  pattern  over 
a  module  to  be  represented  by  a  string  of  n  bits.  Each  bit,  of  course,  can 
have  only  one  value — either  0  or  1 .  Assume  that  there  exists  a  very  large 
data  base,  containing  a  wide  range  of  variabilities,  in  this  bit-coded 
format.  Then  under  these  conditions,  selecting  any  two  situations  at 
random  and  counting  the  matching  bits  should  give  a  result  in  agreement 
with  the  laws  of  probability  regarding  random  events . 

From Bernoulli's formula:

$$ P_n(B) = \frac{n!}{B!\,(n-B)!}\; p^{B} q^{\,n-B} $$

where $P_n(B)$ is the probability that an event will occur exactly B times out
of n trials; p is the probability of the event occurring, common to each
trial; and q is the probability of the event not occurring, i.e., q = 1-p.
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Matching 2 n-bit words is equivalent to n trials where p = q = 0.5.
Substituting for p and q we obtain

$$ P_n(B) = \frac{n!}{B!\,(n-B)!}\; (0.5)^{n} $$

where $P_n(B)$ may be regarded as the probability that B bits will match out
of a bit-string of n bits. Figure 18 shows curves of $P_n(B)$ against B for
various values of n.

The main feature of note is that as n increases, the probability of
obtaining a chance match of other than about 50% of the bits becomes very
small; or, to put it the other way, it is very likely that about 50% of the
bits will match by chance.
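The chance-match distribution is easily evaluated numerically. The Python sketch below computes $P_n(B)$ for p = q = 0.5 and sums the probability of matching between 40% and 60% of the bits for several word lengths; the function names and the choice of the 40-60% band are illustrative.

```python
from math import ceil, comb, floor


def p_match(n, b):
    """Probability that exactly b of n bits match by chance (p = q = 0.5)."""
    return comb(n, b) * 0.5 ** n


def central_probability(n, low=0.4, high=0.6):
    """Probability that the matching fraction lies between low and high."""
    first, last = ceil(low * n), floor(high * n)
    return sum(p_match(n, b) for b in range(first, last + 1))


if __name__ == "__main__":
    for n in (15, 30, 60, 240):
        print(n, round(central_probability(n), 4))
```

The printed values rise toward 1 as n increases, which is the narrowing of the curves about 50% described above.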

In an analogue system such as RASS which utilizes bit-matching
(60 bits per module) the arrangement of bits within the bit-string is not
completely random for several reasons. For example, the range levels
for the specifying parameters are based on a distribution obtained by
sampling, and there is a methodology for bit-coding the range intervals.
Also, for an area such as the Mediterranean, pressure is generally higher
in the south than in the north and this will, on average, be reflected in
the number of bits matching by chance.

Thus on average it would be expected that, when matching two 60-bit
RASS modules, rather more than 30 bits would match by chance. It is not
possible to calculate the actual average because of the complex interactions
involved, inherent in the RASS system itself and in the meteorological
situations and patterns. However such an average for a specified module or
complete regional focus can be determined by experiments. Knowledge of
this average determines the "zero skill" level of RASS matching* and is
required for the setting of the selection gates (see Sections 7.3 and 8.1).

7.6 A Scheme for Scenario-Matching

7.6.1 The Time Tunnel

Section 6.1 described in simple terms how a bit-string representing
a scenario is compiled in RASS. The equation given in that section can be
generalized to cover a scenario of any length in time:

$$ SC_{(\tau-n)\rightarrow\tau} = \sum_{x=0}^{x=n} S_{\tau-x} $$

where x and n are measured in units of the time increment of the data
base.

It is edifying to compare analogue and scenario matching by a
simple pictorial technique. A meteorological situation S can be imagined
as a point in N-space where N is the number of specifying parameters.
The evolution of S with time may be illustrated thus:


[Diagram: evolution of S along the time axis]


where S has been shown at a particular point in time. If S at this point
in time is used as a basis for analogue selection then, of course, the
time is that of the baseday.


* It is interesting to note that because of the bias toward matching,
it is more difficult to find very bad analogues than very good analogues.
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S  would  only  be  a  point  in  N-space  if  the  precise  value  of  the 
specifying  parameters  were  both  known  and  used.  However  the  technique 
of  using  range  levels  for  coding  the  specifying  parameters  introduces 
uncertainty  and  S  should  be  represented  by  a  blob  rather  than  a  point, 
thus: 


time 


Even  allowing  for  the  uncertainty  in  S  ,  a  precise  match  is  most 
unlikely  to  be  found.  In  general,  analogue  candidates  are  scored  and 
ranked,  and  the  top-scoring  analogues  are  selected.  The  maximum  number 
of  mismatching  bits  allowed  before  an  analogue  candidate  is  rejected 
describes  a  "volume"  V  in  N-space  about  S  ,  thus: 


In  analogue  selection,  a  baseday  is  chosen  and  then  the  history 
is  searched  for  meteorological  situations  whose  evolution  in  time  with 
respect  to  S  passes  through  V ;  candidates  not  passing  through  V  (the 
vast majority) are rejected:
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An analogue forecast at time $(\tau+1)$ for meteorological situation S
occurring at time $\tau$ is a compilation of all analogues passing through V,
the compilation being performed on the analogue situations one time period
later than when they passed through V.

One point is immediately apparent from the above diagrams: the
best analogue at time $\tau$ is not necessarily the analogue situation which
will be closest to the evolution of S at time $\tau+1$. Thus an analogue
forecasting system based on the single best analogue at time $\tau$ (the
deterministic approach) is not likely to be consistently successful; a
compilation of a "reasonable" number of analogues is required (the
probabilistic approach). It may also be noted that the closest match(es)
at time $\tau+1$ may lie outside V at time $\tau$, and will therefore not be
included in the compilation. However it is not possible to recognize
these cases in advance and an analogue forecast system assumes that the
evolution of analogues passing through V will more closely resemble the
evolution of S than analogues not passing through V.

The diagrams shown above may be extended to illustrate scenario
matching where, instead of an acceptable match at $\tau$ only, an acceptable
match over a period of time is required. The following diagram is
self-evident: to be successful, analogue candidates must enter and pass
through a "time-tunnel":
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The  above  diagram  requires  that  the  match  be  maintained  over  two 
time intervals, from $\tau-2$ to $\tau$. This is a two-period scenario match.
Matches over longer periods may be obtained.

Note that a cylindrical tunnel requires that the number of matching
bits remains within V for the whole time period. It is more realistic to
accept a greater number of analogues at the start of the scenario matching
process, the selection criteria becoming relatively more stringent as
baseday is approached. The diagram now becomes:


Only those analogue candidates entering the time "funnel" at $\tau-2$
and remaining within to emerge at time $\tau$ are used to compile the forecast
for time $\tau+1$.
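The difference between a cylindrical tunnel and a funnel can be sketched in Python by giving each time step its own limit on the number of mismatching bits. The per-step limits stand in for the volume V at each time; the function name, the limits and the 8-bit strings are invented for illustration.

```python
def passes_funnel(base_seq, cand_seq, max_mismatch):
    """True if the candidate stays within the allowed volume at every step.

    base_seq, cand_seq: bit strings for times tau-n ... tau, oldest first.
    max_mismatch: allowed number of mismatching bits at each of those times.
    """
    for base, cand, limit in zip(base_seq, cand_seq, max_mismatch):
        mismatches = sum(x != y for x, y in zip(base, cand))
        if mismatches > limit:
            return False
    return True


# Invented baseday evolution and candidate for times tau-2, tau-1 and tau.
base_seq = ["11110000", "11110000", "11110000"]
cand_seq = ["00010000", "10110000", "11110001"]

in_tunnel = passes_funnel(base_seq, cand_seq, [2, 2, 2])   # cylindrical
in_funnel = passes_funnel(base_seq, cand_seq, [4, 3, 2])   # narrowing
print(in_tunnel, in_funnel)
```

The candidate starts three bits away from the baseday evolution and converges on it, so it is rejected by the cylinder but accepted by the funnel.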

The effect of a funnel may be achieved in a variety of ways, an
obvious method being to set


An alternative approach is to base the funnel on range-of-scale and pattern
resolution considerations, using large-scale features and coarse resolution
at first, then emphasizing smaller-scale features and finer resolution as
baseday is approached. A method for doing this is described in the
following Section.
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7.6.2  Coupling  RASS  Forecasts  to  Numerical  Forecast  Models 

The pictorial technique developed above may be used to illustrate
a technique for making fuller use of the skill inherent in a numerical
forecast model, both by improving RASS forecasts and by extending the
usefulness of the numerical model.

Considering the meteorological situation S at time $\tau$ as being
the current situation, scenario matching from say $\tau-48$ hours to $\tau$ can
be carried out as previously described. Now assume that a PE (or other)
model is available which demonstrates useful skill out to $\tau+48$ hours.
The forecast situations from this model can be used to extend the "time
funnel" into the future, thus making use of the skill in the PE model to
select analogues for times greater than $\tau+48$ hours. Thus:


[Diagram: time funnel about the evolution of S, extended beyond the baseday]


Only analogue candidates which remain within V for the whole
range of $\tau$ (± 48 hours) are used to compile analogue forecasts for forecast
times greater than 48 hours. Note that the above diagram need not be
symmetrical. For example, $V_{\tau+24}$ and $V_{\tau+48}$ need not be equal to
$V_{\tau-24}$ and $V_{\tau-48}$ respectively, and neither do equal periods about
the baseday have to be used. In fact, the more confidence that can be placed
in the PE model, the less $V_{\tau+24}$ and $V_{\tau+48}$ should be.
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A  method  for  producing  a  bit-string  to  select  analogue  scenarios 
based  on  the  known  past  evolution  of  S  and  its  forecast  future  evolution  is 
shown  below.  (This  method  is  part  of  the  overall  design  of  RASS  and  its 
use  is  demonstrated  in  the  two  examples  discussed  in  Section  8.) 


Key:  Q  Coarse  resolution  only 

Q  Coarse  resolution  plus  medium  resolution  supplement 

A  Full  resolution  (coarse  plus  medium  and  fine  resolution 
supplements) 


Note  that  the  ranges-of-scale  (3)  utilized  in  the  bit  string  depend 
on  time,  as  do  the  degrees  of  resolution  (3)  employed. 

It  is  considered  that  the  ability  to  couple  analogue  scenarios  to  a 
numerical  forecast  model  such  as  the  FNWC  PE  model  is  a  unique  and 
particularly  significant  development.  Not  only  does  the  technique  promise 
to  allow  the  information  provided  by  the  PE  model  to  be  usefully  extended 
by several days, but it should also allow the deterministic nature of the
PE forecast to be converted into probabilistic terms. In other words, if
the  deterministic  result  of  the  PE  model  is  regarded  as  the  most  likely 
evolution  from  the  current  situation,  then  a  selection  of  appropriate 
analogues  will  allow  other  but  less  likely  evolution  possibilities  to  be 
determined.  Such  a  capability  is  of  considerable  operational  significance 
and  of  direct  relevance  to  the  use  of  operational  analysis  techniques  for 
planning  purposes.  However  at  this  stage  of  RASS  development  the  many 
potential  uses  of  scenario  matching  have  yet  to  be  explored  and  exploited. 
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8.  TUNING  AND  VERIFICATION  PROCEDURES 

8.1 Tuning

There  are  basically  two  sets  of  tuning  controls — the  selection  gates 
and  the  weight  factors  assigned  to  the  number  of  matching  bits  achieved 
by  an  analogue  candidate  at  each  phase  of  the  selection  process.  As 
discussed  in  Section  7.3  and  as  presently  used  in  RASS,  essentially  the 
selection  gate  levels  control  the  number  of  analogues  selected  while  the 
weights  decide  the  final  ranking  by  adjusting  the  relative  significance  of 
any  chosen  pattern  characteristics. 

8.1.1  The  Selection  Gates 

In carrying out the selection process it is important to select a
"reasonable" number of analogues. If too many are scored, selected and
ranked, computer resources are being expended unnecessarily, and if too
few are selected the process has to be repeated after lowering the gate
levels, which again wastes computer time. (It is not possible to know how
many analogues will be selected for an arbitrary baseday. The selection
process could be stopped once a chosen number is reached but this is not a
realistic approach as the best analogues may not have been reached in the
search.)

Selection  of  this  "reasonable"  number  has  to  be  based  on  knowing, 
on  average,  how  many  analogues  will  be  selected  from  the  appropriate 
season  for  a  randomly-chosen  baseday.  Selection  gates  are  set  so  that  this 
average  number  is  "reasonable".  Of  course,  if  any  particular  baseday  is  a 
commonly-occurring  situation  then  a  larger  number  will  be  selected,  and 
vice versa. Determining this average number of selections involves those
considerations  discussed  in  Section  7.5. 


The fact that a very low number of analogues is selected for a given
baseday is itself of value, since it indicates that an unusual event is
occurring.
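
The setting of a gate level from an average selection count can be sketched as follows. This is a minimal illustration, assuming that a gate is a simple threshold on the count of matching bits and that scores of the history against a few sample basedays are already available; the function names and the sample scores are hypothetical.

```python
def mean_selected(score_sets, gate):
    """Average number of candidates passing the gate, taken over sample basedays."""
    counts = [sum(1 for score in scores if score >= gate) for scores in score_sets]
    return sum(counts) / len(counts)


def choose_gate(score_sets, target):
    """Highest gate level at which the average selection count still reaches target."""
    candidates = sorted({s for scores in score_sets for s in scores}, reverse=True)
    for gate in candidates:
        if mean_selected(score_sets, gate) >= target:
            return gate
    # Even the lowest gate selects fewer than the target on average.
    return candidates[-1]


# Hypothetical matching-bit scores of the history against three sample basedays.
sample_scores = [[10, 8, 7, 3], [9, 9, 4, 2], [6, 5, 5, 1]]
gate = choose_gate(sample_scores, target=2)
```

With a gate chosen in this way, a commonly occurring baseday yields more than the target number of selections and an unusual baseday yields fewer.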
8.1.1.1 Persistence


A  method  for  arriving  at  an  approximation  to  this  average  number  of 
analogues  to  an  arbitrary  baseday  is  to  select  a  small  number  of  basedays 
and  match  them  (by  counting  matching  bits)  against  their  own  evolution. 
Thus if $S_\tau$ is the baseday and $n$ is an integer number of days, the
procedure may be expressed by

$$S_\tau : S_{\tau+n}$$

where ":" indicates the process of counting matching bits. In effect this
procedure detects the persistence of $S_\tau$ out to 15 days.

Figures  19,  20  and  21  show  the  persistence  out  to  15  days  of  the 
nine  component  fields  using  all  31  days  of  January  1967  as  basedays. 

Similar curves are discussed in greater detail in Section 8.2. However,
with regard to selection gates, if it is assumed that there is zero
persistence after 15 days (i.e., that $S_{\tau+15}$ is independent of $S_\tau$) and
that January 1967 was a "typical" January, then the match coefficient at
$S_{\tau+15}$ gives a measure of the number of bits likely to match by chance
in all Januaries.

To  obtain  this  measure  correctly  for,  say,  January,  requires  matching 
two  randomly  selected  situations  from  all  Januaries  in  the  data  base, 
repeating  this  process  a  large  number  of  times,  then  taking  a  mean  of  the 
count  of  matching  bits.  However,  for  the  purposes  of  setting  selection 
gate  levels  the  approximate  process  has  been  found  satisfactory  with  regard 
to  selection  of  a  reasonable  number  of  analogues  (see  also  Section  8.2). 
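
Both the persistence curve and the more rigorous chance-level estimate described above can be sketched in a few lines. This is an illustration under stated assumptions: each situation is represented as an equal-length sequence of bits, basedays are limited to those whose evolution lies within the supplied series, and the function names are hypothetical.

```python
import random


def matching_fraction(a, b):
    """Fraction of positions at which two equal-length bit sequences agree."""
    if len(a) != len(b):
        raise ValueError("bit sequences must have equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def persistence_curve(series, max_lag):
    """Mean matching fraction of S_tau : S_(tau+n) for n = 0 .. max_lag."""
    curve = []
    for n in range(max_lag + 1):
        pairs = [(series[t], series[t + n]) for t in range(len(series) - n)]
        curve.append(sum(matching_fraction(a, b) for a, b in pairs) / len(pairs))
    return curve


def chance_level(series, trials, rng):
    """Mean matching fraction over randomly chosen pairs of situations."""
    total = 0.0
    for _ in range(trials):
        a, b = rng.sample(series, 2)
        total += matching_fraction(a, b)
    return total / trials


# Four hypothetical daily situations, each coded as 4 bits.
days = [[0, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 1], [0, 1, 1, 1]]
curve = persistence_curve(days, max_lag=2)
baseline = chance_level(days, trials=100, rng=random.Random(0))
```

The approximate method in the text reads the chance level from the tail of the persistence curve; the random-pair estimate is the "correct" measure it approximates.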


Figure  20  January  1967  1000-mb  Persistence 


8.1.2 Weight Factors

Suitable choice of the weighting factors applied to the number of
matching bits at each stage of the selection process allows the relative
significance of any input component to the final score to be controlled.
There are 9 of these components (3 ranges-of-scale x 3 degrees of
resolution). Basically (and obviously) the smaller the range-of-scale and
the finer the degree of resolution, the more difficult it is to obtain good
analogues.

The  three  ranges-of-scale  are  adequate  to  represent  disturbances 
of  the  atmosphere  in  space.  However,  associated  with  each  range-of-scale 
there  is  a  range-in-time;  SV  disturbance  components  vary  slowly,  SL 
components  more  quickly,  and  SD  components  vary  rapidly.  To  capture 
time-variabilities  on  the  SV  scale,  SV  analyses  in  the  data  base  should 
be at intervals of 1 or 2 days; the available data base contains SV analyses
with this frequency (see Section 3.2.2.2). For SL analyses, the analysis
frequency should be every 12 hours; the available data base is adequate
for some periods of the history but not for all the history. However, to
capture time-variabilities on the SD scale, SD analyses are required
every 6 hours with an interpolation capability down to 1 hour; in this
respect the data base is completely inadequate.

To  illustrate  the  effect  of  this  lack  of  resolution-in-time  on  the  SD 
range-of-scale,  imagine  that  good  analogues  for  a  particular  baseday  have 
been found in the SV and SL ranges-of-scale. Because SD analyses are
only available at 12-hour or 24-hour intervals, they will appear to be
scattered almost randomly through these analogues and their subsequent
evolution; in fact, based on 24-hour analyses, the SD range-of-scale
(in time) appears as "noise". A forecaster requires synoptic analyses
every 6 hours for a large area and every 3 hours for local-area forecasting;
this requirement is no less critical for RASS, which matches synoptic
situations and their evolutions in space and time.

The need to interpolate SD features to a time-resolution of 1 hour
is to provide a "phase-matching" capability. For example, an analogue
may match the baseday situation very well with regard to the SV and SL
features, but the evolution of analogue SD features may lead or lag those
of the baseday by a small number of hours. The analogue should therefore
be adjusted to correspond to the "phase" of the baseday, a feasible
process given sufficient time-resolution in the data base.
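
The phase-matching adjustment can be pictured as a search over small time shifts for the one that maximizes the match. The following is a sketch only, assuming hourly bit-coded SD situations are available (which, as noted, the present data base does not provide); the function names and sample patterns are hypothetical.

```python
def matching_bits(a, b):
    """Count of positions at which two bit sequences agree."""
    return sum(x == y for x, y in zip(a, b))


def best_phase_shift(baseday_seq, analogue_seq, max_shift):
    """Return (shift, mean matching bits) for the best-matching shift.

    A shift of +k compares baseday_seq[t] with analogue_seq[t + k],
    i.e. the analogue features lag those of the baseday by k time steps.
    """
    best = None
    for shift in range(-max_shift, max_shift + 1):
        scores = [
            matching_bits(baseday_seq[t], analogue_seq[t + shift])
            for t in range(len(baseday_seq))
            if 0 <= t + shift < len(analogue_seq)
        ]
        if not scores:
            continue
        mean = sum(scores) / len(scores)
        if best is None or mean > best[1]:
            best = (shift, mean)
    return best


# Hypothetical hourly patterns; the analogue repeats the baseday one step late.
baseday = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0]]
analogue = [[1, 1, 1]] + baseday
shift, score = best_phase_shift(baseday, analogue, max_shift=2)
```

Here the search recovers a shift of one step, after which the analogue evolution coincides with that of the baseday.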

SD  features  are  largely  responsible  for  operationally  significant 
weather  factors  and  therefore  their  importance  should  be  reflected  in 
analogue selection and ranking. However, at this time, due to the lack of
resolution in the data base along the time axis, very little weight can be
given to the SD range-of-scale. Given the currently available data base,
the  SL  range-of-scale  is  the  smallest  that  can  be  matched  with  any  degree 
of  success.  Therefore  the  weight  factors  assigned  to  SL  fields  are 
accentuated  accordingly. 


8.2 Persistence Climatologies


Section  8. 1.1.1  discussed  the  relevance  of  persistence  in  establishing 
selection  gate  levels  to  ensure  that  an  adequate  but  reasonable  number  of 
analogues  are  chosen  in,  ideally,  one  pass  through  the  available  history. 
However, the main use of persistence scores is to establish a "zero skill"
level against which to compare the effectiveness of RASS; a variety of
climatologies  has  been  derived  for  this  purpose. 

In all climatologies the formulation given previously has been used
out to 15 days; i.e.,

$$S_\tau : S_{\tau+n}$$

where $\tau$ assumes a range of values depending on the climatology required.
For example, to derive an all-Januaries climatology, $\tau$ covers the range
1 to 31 for all Januaries in the data base. Note that a January climatology
is based on a 15-day period starting in January but including contributions
from situations up to mid-February. From the results for each January a
mean curve is calculated for the all-Januaries persistence climatology.

The  climatology  to  be  used  for  verification  of  RASS  is  the  monthly 
climatology  appropriate  to  the  baseday.  This  monthly  climatology  has  been 
derived  for  each  month  (12),  by  each  component  field  (9),  and  by  each 
degree  of  resolution  plus  one  all-resolution  category  (4).  In  all  cases  the 
modules  incorporated  in  the  climatology  are  those  appropriate  to  the  range- 
of-scale  and  resolution  considered;  these  modules  are  shown  in  Figs.  15, 
16  and  17. 

In  addition  to  monthly  values,  seasonal  and  annual  persistence 
climatologies  have  also  been  derived.  Figure  22  shows  an  example  of  an 
all-years persistence climatology using equal weight factors (= 1) at
each selection gate; it thus falls in the all-resolution category. (Note that
in this figure the match coefficient is the fraction of mismatching bits;
compare with Figs. 19-21, which use the fraction of matching bits.)


8.3 The RASS Verification Scheme


The  RASS  methodology  for  selecting,  scoring  and  ranking  analogues 
is  described  in  Section  7.  As  explained,  the  selection  process  includes 
considerations of 3 ranges-of-scale and 3 degrees of resolution. For a
regional  focus,  the  actual  terrestrial  area  involved  in  analogue  selection 
is  a  function  of  both  resolution  and  range-of-scale.  Thus,  for  example, 
Figs.  15-17  show  the  modules  and  associated  ranges-of-scale  and  degrees 
of  resolution  for  the  Greater  Mediterranean  regional  focus.  It  will  be  noted 
that  the  area  involved  in  analogue  selection  is  much  greater  than  the 
Mediterranean  Sea  itself. 

The  analogue  verification  scheme  is  designed  to  operate  on  a  smaller 
area  than  that  involved  in  analogue  selection.  This  area  (in terms  of 
modules)  is  called  the  OBJECT  REGION,  and  is  presently  defined  as  those 
modules,  appropriate  to  each  range-of-scale,  for  which  full  resolution  is 
used  in  analogue  selection.  Thus,  for  the  Greater  Mediterranean  regional 
focus,  the  object  region  modules  are  as  follows: 

SD range-of-scale. Modules no. 67, 68, 89, 90, 91, 93. (See Fig. 15.)

SL  range-of-scale.  Modules  no.  21,  24.  (See  Fig.  16.) 

SV  range-of-scale.  Modules  no.  7,  8,  13,  14.  (See  Fig.  17.) 

Only  these  modules  are  used  in  the  RASS  verification  scheme. 

The  verification  process  currently  is  performed  out  to  eight  days  from 
the  time  of  the  selected  baseday;  this  verification  period  can  be  varied. 
There are basically three stages involved in the verification procedure:

a. Persistence Verification. For a selected baseday of time $\tau$
($BD_\tau$) the ensuing events are matched against $BD_\tau$. Thus,
using the nomenclature previously explained:

$$BD_{\tau+x} : BD_\tau, \quad x = 0 \to 8$$

This score shows the effectiveness of persistence forecasting,
i.e., the effect of assuming that the baseday situation remains
unchanged for 8 days.

b. Climatology Verification.² The baseday, $BD_\tau$, and its ensuing
scenario, is matched against the climatology appropriate to the
calendar month ($C_{BD}$) of the baseday. Thus:

$$BD_{\tau+x} : C_{BD}, \quad x = 0 \to 8$$

This score shows the effect of assuming that climatologically
normal conditions will prevail for the next 8 days.

c. Analogue Verification. The day-by-day evolution of each of
the top N analogues (where N can be specified) is matched
against the evolution of the baseday situation. Thus:

$$S_{n,\tau'+x} : BD_{\tau+x}, \quad n = 1 \to N, \quad x = 0 \to 8$$

where $\tau'$ is the date-time of the selected analogue $S_n$.

Examples  of  verification  records  are  given  in  Sections  8.4.1 
and  8.4.2. 
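
The three verification stages above can be sketched together as follows. This is a minimal illustration, assuming that each situation (and the monthly climatology) is available as an equal-length sequence of bits restricted to the object-region modules; the function names and the short sample sequences are hypothetical.

```python
def matching_fraction(a, b):
    """Fraction of positions at which two equal-length bit sequences agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def verify(baseday_evolution, climatology, analogue_evolutions):
    """Score persistence, climatology and analogues against the baseday evolution.

    baseday_evolution[x]       is BD at tau + x
    climatology                is the monthly climatology of the baseday
    analogue_evolutions[n][x]  is analogue n at tau' + x
    """
    baseday = baseday_evolution[0]
    persistence = [matching_fraction(bd, baseday) for bd in baseday_evolution]
    climate = [matching_fraction(bd, climatology) for bd in baseday_evolution]
    analogues = [
        [matching_fraction(s, bd) for s, bd in zip(evolution, baseday_evolution)]
        for evolution in analogue_evolutions
    ]
    return {"persistence": persistence, "climatology": climate, "analogues": analogues}


# Hypothetical 4-bit situations for days 0, 1 and 2.
bd_evolution = [[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1]]
monthly_climatology = [1, 0, 0, 1]
analogue_set = [[[1, 1, 0, 0], [1, 0, 0, 1], [0, 0, 1, 1]]]
record = verify(bd_evolution, monthly_climatology, analogue_set)
```

In the scheme described in the text the evolution would extend to day 8 and the scores would be kept separately for each of the nine component fields.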


² As used in this sense, climatology refers to the mean (in time) of
the component fields.

Verification, of course, can only be carried out using historical
information.  Thus  in  an  operational  mode  the  verification  scores  of 
analogues  selected  for  a  particular  baseday  will  not  be  available  until 
8  days  later. 

In  the  RASS  verification  scheme,  records  are  stacked  as  they  are 
produced  and,  once  an  adequate  sample  has  accumulated,  various  statistical 
measures  can  be  produced  to  show  the  performance  of  analogue  selections 
over,  for  example,  the  previous  month. 

8.4 Demonstration of Current RASS Capabilities

The current capabilities of RASS are demonstrated by application to
two scenarios³ for the Greater Mediterranean region of focus chosen from
historical  records.  The  first  demonstration  is  based  on  the  scenario  ensuing 
from  12Z  22  AUG  69  and  is  presented  in  Section  8.4.1;  the  second 
demonstration,  presented  in  Section  8.4.2,  is  based  on  the  scenario 
ensuing  from  12Z  18  OCT  75. 

While  studying  the  tables  and  charts  presented  for  each  demonstration, 
the  following  points  should  be  kept  in  mind: 

a. As discussed in Section 8.1.2, because of the inadequacy of the
data base with regard to SD features, little weight can be given
to this range-of-scale. In general, therefore, a good match for
SD features is less likely than for the more strongly accentuated
SL features.


³ Dates specified by NEPRF.

b. Analogue selection is based on scenario-matching, discussed
in Section 7.6, using a "double-ended time funnel" of the type
shown in Section 7.6.2. In using this time funnel, however,
the  "forecast  future"  shown  on  the  diagram  was  available  from 
historical  records.  A  suitable  input  to  such  a  time  funnel  is 
shown  on  page  58,  but  the  current  data  base  does  not  contain 
all  the  required  analyses.  The  procedure  adopted  to  circumvent 
data  base  deficiencies  was  to  assume  that  any  missing  field 
was  identical  to  the  last  available  analysis  of  that  field.  The 
effect  of  this  assumption  of  persistence  is,  of  course, 
particularly  severe  when  matching  the  SD  range-of-scale. 

c.  The  analogue  scenarios  presented  in  chart  form  (two  scenarios 
for  each  baseday  scenario)  must  not  be  regarded  as  deterministic. 
The  scenarios  given  should  be  regarded  as  being  only  two 
examples  of  a  set  of  possible  scenarios  evolving  from  initial 
conditions  similar  to  those  of  the  baseday  scenario.  (The  other 
possible  scenarios  have  not  been  included  in  this  Report  due  to 
space  limitations.)  The  set  of  possible  scenarios  would  be 
used,  for  example,  to  compile  a  forecast  of  surface  winds  in 
probabilistic  terms — there  are  many  other  potential  uses. 

d.  The  term  "climatology"  used  in  each  of  the  two  verification 
summaries  refers  to  the  monthly  mean  field  for  each  of  the  nine 
component-fields.  This  mean  field,  of  course,  is  relatively 
flat  and  featureless  and  has  no  utility  in  generating  weather 
information — no  more  than  weather  information  for  the 
Mediterranean  can  be  produced  from,  say,  a  monthly-mean 
chart  of  sea-level  pressure.  Comparison  of  the  fields 
representing  an  actual  meteorological  situation  with  the  mean 
fields  merely  yields  a  measure  of  the  degree  of  pattern 
similarity between these two sets of fields; this measure has
little  significance. 
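
The treatment of missing analyses described in point b (each missing field taken as identical to the last available analysis of that field) amounts to a forward fill. A minimal sketch follows; the function name is hypothetical and a missing analysis is represented by None.

```python
def fill_missing_with_last(analyses):
    """Replace each missing analysis (None) by the last available one.

    Entries before the first available analysis remain None, since there
    is nothing earlier to persist.
    """
    filled = []
    last = None
    for field in analyses:
        if field is not None:
            last = field
        filled.append(last)
    return filled


# Hypothetical sequence of analysis times; None marks a missing analysis.
sequence = ["A", None, None, "B", None]
completed = fill_missing_with_last(sequence)
```

Because the filled entries are exact copies, any rapidly varying feature is held artificially fixed, which is why the effect is most severe for the SD range-of-scale.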


The  two  demonstrations  are  presented  without  discussion  of  the  tables 
or  charts  as  significant  similarities  and  differences  are  readily  apparent  by 
visual  inspection. 

The  first  table  in  each  Section  shows  the  top  25  analogues.  For 
each  selection,  two  rows  of  figures  are  given;  the  upper  row  shows  scores 
based on the baseday situation, while the lower row shows scores based
on the baseday scenario. (The ordering of the selections was based on final
scenario scores.) All scores are given in parts per 1000 (i.e., % x 10).
The  first  9  columns  show  the  scores  of  unweighted  matching  bits  for  each  of 
the  nine  component  fields.  The  sum  of  the  unweighted  bits,  normalized  to 
1000,  is  shown  in  column  10.  The  final  column  shows  the  final  score  based 
on  the  weighted  sum  of  matching  bits,  again  normalized  to  1000. 
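
The relation between the per-field scores, the unweighted sum and the weighted final score can be sketched as below. The exact normalization used in RASS is not spelled out here, so the form adopted (matching bits divided by available bits, with weights applied to both) is an assumption, as are the function name and the sample figures.

```python
def scores_per_1000(matches, totals, weights):
    """Return per-field, unweighted and weighted scores in parts per 1000.

    matches[i] -- matching bits for component field i
    totals[i]  -- bits available for component field i
    weights[i] -- weight factor for component field i (assumed form)
    """
    per_field = [round(1000 * m / t) for m, t in zip(matches, totals)]
    unweighted = round(1000 * sum(matches) / sum(totals))
    weighted_matches = sum(w * m for w, m in zip(weights, matches))
    weighted_totals = sum(w * t for w, t in zip(weights, totals))
    weighted = round(1000 * weighted_matches / weighted_totals)
    return per_field, unweighted, weighted


# Three hypothetical component fields (the tables in this Report use nine).
per_field, unweighted, weighted = scores_per_1000(
    matches=[8, 6, 2], totals=[10, 10, 10], weights=[1, 2, 0]
)
```

A zero weight removes a field from the final score while leaving its unweighted column in the table, which mirrors the low weight given to the SD range-of-scale.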

As will be noted, the top eight analogues for the summer case and the
top nine analogues for the winter case were part of the same sequence as
the baseday; these cases, therefore, have been excluded from the
verification summaries. The verification summaries are given in the next
four tables in each Section and show, for the baseday to baseday+8 days,
the scores for each
of  the  nine  component  fields.  These  scores  are  in  terms  of  unweighted 
matching  bits,  apart  from  the  final  row  which  is  the  weighted  sum  of 
matching  bits.  For  each  verification  summary  the  mean-field  climatology 
is  given  first  (see  paragraph  d  above),  followed  by  persistence.  Then 
follow  the  scores  for  the  ten  scenarios  selected  as  most  closely  resembling 
the baseday scenario.

Each demonstration presents charts for 3 scenarios showing the
SL500, SL1000, SD500 and SD1000 component fields at day 0, day 2 and
day 5. The first scenario in each case is that of the baseday, followed
by two scenarios chosen from the list of analogue selections. These are
as follows:

               Section 8.4.1                  Section 8.4.2

1st Scenario   12Z 22 AUG 69 (baseday)        12Z 18 OCT 75 (baseday)
2nd Scenario   12Z 04 SEP 52 (selection 10)   12Z 26 APR 72 (selection 11)
3rd Scenario   12Z 15 JUL 66 (selection 11)   12Z 10 NOV 68 (selection 12)
8.4.1 RASS Demonstration 1: Baseday 12Z 22 AUG 69


List  of  contents: 

Analogue  Selection  Table:  page  77 

Verification  Summary  :  pages  78-81 

1st  Scenario  (baseday)  :  pages  82-93 

2nd  Scenario  :  pages  94-105 

3rd  Scenario  :  pages  106-117 

(To  facilitate  study  of  the  charts,  each  scenario  has  been  separated  from 
the  next  by  an  unnumbered  yellow  insert;  within  each  scenario  sets  of 
component  fields  are  separated  by  an  unnumbered  blue  insert.) 
APPENDIX A


SCALE-AND-PATTERN SPECTRA
AND DECOMPOSITIONS

Two  of  the  fundamental  concepts  in  the  interpretation  of  meteorological 
fields are those of pattern and scale. In 1963, MII developed an objective
technique*  for  separating  any  geophysical  field  into  recognizable 
characteristic  patterns,  or  features,  evident  in  the  field,  so  that  their 
relative  contributions  to  the  total  can  be  quantitatively  represented. 

Using the 500-mb height field (HT) as an example, this may be
decomposed into additive component ranges-of-scale expressed by:

HT = SD + SR
   = SD + SL + SV

where SD is the Disturbance range-of-scale component
      SR is the Residual range-of-scale component
      SL is the Long-wave range-of-scale component
      SV is the Planetary Vortex.

By definition, SR = SL + SV.

Figure  A1  shows  the  500-mb  height  analysis  for  12Z  on  21  OCT  64. 
Decomposing  this  field  into  its  inherent  ranges-of-scale  yields  the  SV 
field shown in Fig. A2, the SL field shown in Fig. A3 and the SD field
shown  in  Fig.  A4. 
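
The additive structure of the decomposition can be illustrated with a simple stand-in filter. The sketch below is not the scale-and-pattern technique cited in this Appendix; it assumes a cyclic running mean on a one-dimensional profile purely to show how HT = SD + SL + SV holds by construction. All function names, filter widths and sample heights are hypothetical.

```python
def running_mean(values, half_width):
    """Cyclic running mean over 2 * half_width + 1 points (stand-in smoother)."""
    n = len(values)
    width = 2 * half_width + 1
    return [
        sum(values[(i + k) % n] for k in range(-half_width, half_width + 1)) / width
        for i in range(n)
    ]


def decompose(ht, short_half_width, long_half_width):
    """Split a profile into SD, SL and SV such that HT = SD + SL + SV."""
    sr = running_mean(ht, short_half_width)   # residual: disturbances smoothed out
    sv = running_mean(sr, long_half_width)    # vortex: long waves smoothed out too
    sd = [h - r for h, r in zip(ht, sr)]      # SD = HT - SR
    sl = [r - v for r, v in zip(sr, sv)]      # SL = SR - SV
    return sd, sl, sv


# Hypothetical 500-mb heights (metres) around a latitude circle.
heights = [5500, 5520, 5480, 5450, 5470, 5530, 5560, 5510]
sd, sl, sv = decompose(heights, short_half_width=1, long_half_width=3)
```

Because each component is defined as a difference of successively smoothed fields, the three components always sum back to the original field, whatever smoother is used.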

* Manfred M. Holl, Scale-and-pattern spectra and decompositions.
Technical Memorandum No. 3, Contract N228-(62271)60550, Meteorology
International Incorporated, Monterey, California, 1963.